GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on: July 15, 2004, 16:25:44 ; Search time 27.1642 Seconds
        (without alignments)
        540.877 Million cell updates/sec

Title: US-09-423-100-5
Perfect score: 294
Sequence: 1 FVNQHLCGSHLVEALYLVCG IVEQCCTSICSLYQLENYCN 52

Scoring table: BLOSUM62
               Gapop 10.0 , Gapext 0.5

Searched: 1586107 seqs, 282547505 residues

Total number of hits satisfying chosen parameters: 1586107
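
The throughput figure in the header can be cross-checked from the other header values. The sketch below assumes the usual definition of a cell update (one dynamic-programming cell per pair of query residue and database residue); that definition is an assumption, not something the report states, and the function name is illustrative only.

```python
# Values copied from the report header.
QUERY_LENGTH = 52            # the query spans Qy positions 1..52
DB_RESIDUES = 282_547_505    # "Searched: 1586107 seqs, 282547505 residues"
SEARCH_SECONDS = 27.1642     # "Search time 27.1642 Seconds"


def cell_updates_per_second(query_len: int, db_residues: int, seconds: float) -> float:
    """Assumed definition: one cell per (query residue, database residue) pair."""
    return query_len * db_residues / seconds


rate_millions = cell_updates_per_second(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS) / 1e6
print(f"{rate_millions:.3f} million cell updates/sec")
```

Under that assumption the result agrees with the reported 540.877 million cell updates/sec to within rounding of the search time.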
RESULT 1
AAY42859 

ID AAY42859 standard; protein; 52 AA. 
XX 

AC AAY42859; 
XX 

DT 19-JAN-2000 (first entry) 
XX 

DE Human insulin precursor, SEQ ID 5. 
XX 

KW Insulin; precursor; growth hormone; chaperone; intramolecular; folding; 

KW conformation; chimeric protein; cleavable; recombinant; production; 

KW yield. 
XX 

OS Homo sapiens. 
XX 



PN WO9950302-A1. 
XX 

PD 07-OCT-1999. 
XX 

PF 31-MAR-1998; 98WO-CN000052.
XX

PR 31-MAR-1998; 98WO-CN000052.
XX 

PA (TONG-) TONGHUA GANTECH BIOTECHNOLOGY LTD. 
XX 

PI Gan Z; 
XX 

DR WPI; 1999-610839/52. 
XX 

PT New chimeric proteins containing human growth hormone fragment, used 

PT particularly for the production of human insulin. 

XX 

PS Claim 12; Page 29-30; 46pp; English. 
XX 

CC This sequence represents a human insulin precursor comprising insulin A 

CC and B chains. This insulin precursor is a component of the chimeric 

CC proteins hGH-mini-proinsulin (AAY42860) and the chimeric protein given in 

CC AAY42861. These chimeric proteins additionally contain an N-terminal 

CC fragment of human growth hormone (hGH) and a cleavable peptide linker 

CC (AAY42857). The hGH portion of the chimeric protein acts as an

CC intramolecular chaperone (IMC) for the insulin precursor, enabling it to 

CC fold correctly. The cleavable peptide linker has a C-terminal Arg residue 

CC which enables the hGH portion of the chimeric protein to be removed after 

CC folding has taken place. Production of recombinant human insulin via an 

CC hGH-proinsulin chimeric protein can provide human insulin with correctly 

CC linked cysteine bridges with fewer necessary procedural steps, and hence 

CC resulting in a higher yield of human insulin. The IMC sequences not only 

CC protect insulin sequences from intracellular degradation by a 

CC microorganism host, but also promote the folding of the fused insulin 

CC precursor, facilitate the solubility of the fusion protein and decrease 

CC the intermolecular interactions among the fusion proteins, thus allowing 

CC folding of the fused insulin precursor at commercially useful high 

CC concentrations. The procedural steps of cyanogen bromide cleavage, 

CC oxidative sulphitolysis and related purification steps can thus be 

CC eliminated, along with the use of high concentrations of mercaptan or the 

CC use of hydrophobic absorbent resins 

XX 

SQ Sequence 52 AA; 

Query Match 100.0%; Score 294; DB 2; Length 52;
Best Local Similarity 100.0%; Pred. No. 1.1e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
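
The "Perfect score" of 294 in the header is consistent with the query scored against itself under BLOSUM62. The following is a minimal sketch of that check; the diagonal values are the standard BLOSUM62 self-substitution scores, and the helper name `self_score` is hypothetical.

```python
# BLOSUM62 diagonal (self-substitution) scores for the 20 standard amino acids.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6, "H": 8, "I": 4,
    "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4, "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Full query sequence as printed on the Qy line of the alignment.
QUERY = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN"


def self_score(sequence: str) -> int:
    """Score of an ungapped alignment of a sequence with itself."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence)


perfect_score = self_score(QUERY)
print(len(QUERY), perfect_score)
```

Any hit that contains the complete query without substitutions or gaps, such as this one, therefore reaches the perfect score regardless of the length of the database entry.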



RESULT 2 
AAR68901 

ID AAR68901 standard; peptide; 56 AA. 
XX 



AC AAR68901; 
XX 

DT 25-MAR-2003 (revised) 

DT 02-MAR-1995 (first entry) 

XX 

DE Human pro-insulin 3. 
XX 

KW Pro-insulin; A-chain; B-chain; C-chain; disulphide; mercaptan;

KW chaotropic agent. 

XX 

OS Homo sapiens. 
XX 

PN EP600372-A1. 
XX 

PD 08-JUN-1994. 
XX 

PF 25-NOV-1993; 93EP-00118993.
XX

PR 02-DEC-1992; 92DE-04240420.
XX 

PA (FARH ) HOECHST AG. 
XX 

PI Obermeier R, Gerl M, Ludwig J, Sabel W; 
XX 

DR WPI; 1994-177718/22. 
XX 

PT Prodn. of pro-insulin with correct di:sulphide bridges - by treating

PT recombinant precursor protein with mercaptan in alkali and in presence of 

PT chaotropic agent, then isolation on hydrophobic resin. 

XX 

PS Disclosure; Page 12; 15pp; German. 
XX 

CC Pro-insulin is produced by treating recombinant precursor protein with a 

CC mercaptan to provide 2-10 SH residues per Cys residue, in presence of a 

CC chaotropic agent and in aq. medium of pH 10-11, treating the prod. with 3

CC -50 g hydrophobic adsorber resin per l aq. medium of pH 4-7, isolating

CC the adsorbed resin and pro-insulin and desorbing the pro-insulin. This 

CC method produces pro-insulin with correctly bonded Cys bridges. Compared 

CC with known methods it involves fewer stages (esp. no sulphitolysis or 

CC cyanogen bromide cleavage) and overall losses during purification are 

CC reduced, i.e. the process is quicker and gives better yields. Sequences 

CC of insulin chain A, B and C are given in AAR68895-97. Sequences of pro- 

CC insulin 1-4 are given in AAR68898-901. (Updated on 25-MAR-2003 to correct

CC PN field.)
XX 

SQ Sequence 56 AA; 

Query Match 100.0%; Score 294; DB 2; Length 56;
Best Local Similarity 100.0%; Pred. No. 1.1e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          5 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 56



RESULT 3 



AAR78665

ID AAR78665 standard; protein; 56 AA.
XX

AC AAR78665;
XX 

DT 03-APR-1996 (first entry) 
XX 

DE Proinsulin sequence 3. 
XX 

KW Proinsulin; post-translational modification; recombinant production;

KW protein folding; conformation. 

XX 

OS Synthetic. 
XX 

FH Key Location/Qualifiers
FT Region 1..4
FT /label= R2
FT /note= "a peptide of 4 amino acids"
FT Peptide 5..34
FT /label= R1-(B2-B29)-Y
FT /note= "human insulin B-chain"
FT Region 35
FT /label= X
FT Peptide 36..56
FT /label= Gly-(A2-A20)-R3
FT /note= "human insulin A-chain"

XX 

PN EP668292-A2. 
XX 

PD 23-AUG-1995. 
XX 

PF 09-FEB-1995; 95EP-00101748 . 
XX 

PR 18-FEB-1994; 94DE-04405179 . 
XX 

PA (FARH ) HOECHST AG. 
XX 

PI Obermeier R, Gerl M, Ludwig J, Sabel W; 
XX 

DR WPI; 1995-284754/38. 
XX 

PT Isolation of insulin that is correctly post-translationally processed -
PT by reacting pro:insulin with a mercaptan in the presence of a chaotropic
PT agent and purificn. after absorption to hydrophobic resin.
XX

PS Example 2; Page 13; 16pp; German.
XX 

CC The present sequence is an example of a proinsulin molecule corresp. to
CC the general formula R2-R1-(B2-B29)-Y-X-Gly-(A2-A20)-R3 (II). In formula
CC (II), X = Lys, Arg or a peptide of 2-35 amino acids contg. Lys or Arg at
CC the N- and C-termini; Y = a natural amino acid; R1 = Phe or a bond; R2 =
CC H, Arg, Lys, a peptide of 2-45 amino acids contg. Arg or Lys at the N-
CC and C-termini; R3 = a natural amino acid; (A2-A20) and (B2-B29) are the
CC insulin A- and B-chain sequences from human or other insulin. The
CC proinsulin molecule (produced in recombinant E.coli) is reacted with
CC mercaptan at a ratio of 2-10 SH residues of mercaptan per Cys residue of
CC proinsulin. The reaction takes place in the presence of a chaotropic
CC auxiliary agent at pH 10-11 and results in proinsulin with correctly
CC linked cystine bridges. Reaction with trypsin and opt. carboxypeptidase B
CC yields correctly folded insulin. The insulin is isolated by absorption on
CC a hydrophobic resin

XX 

SQ Sequence 56 AA; 

Query Match 100.0%; Score 294; DB 2; Length 56;
Best Local Similarity 100.0%; Pred. No. 1.1e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          5 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 56

PF 25-NOV-1993; 93EP-00118993.
XX
PR 02-DEC-1992; 92DE-04240420.
XX
PA (FARH ) HOECHST AG.
XX
PI Obermeier R, Gerl M, Ludwig J, Sabel W;
XX
DR WPI; 1994-177718/22.
XX
PT Prodn. of pro-insulin with correct di:sulphide bridges - by treating
PT recombinant precursor protein with mercaptan in alkali and in presence of
PT chaotropic agent, then isolation on hydrophobic resin.
XX
PS Disclosure; Page 11-12; 15pp; German.
XX
CC Pro-insulin is produced by treating recombinant precursor protein with a
CC mercaptan to provide 2-10 SH residues per Cys residue, in presence of a
CC chaotropic agent and in aq. medium of pH 10-11, treating the prod. with 3
CC -50 g hydrophobic adsorber resin per l aq. medium of pH 4-7, isolating
CC the adsorbed resin and pro-insulin and desorbing the pro-insulin. This
CC method produces pro-insulin with correctly bonded Cys bridges. Compared
CC with known methods it involves fewer stages (esp. no sulphitolysis or
CC cyanogen bromide cleavage) and overall losses during purification are
CC reduced, i.e. the process is quicker and gives better yields. Sequences
CC of insulin chain A, B and C are given in AAR68895-97. Sequences of pro-
CC insulin 1-4 are given in AAR68898-901. (Updated on 25-MAR-2003 to correct
CC PN field.)
XX 

SQ Sequence 63 AA; 

Query Match 100.0%; Score 294; DB 2; Length 63; 

Best Local Similarity 100.0%; Pred. No. 1.3e-26; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         12 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 63



RESULT 5 
AAR68899 

ID AAR68899 standard; peptide; 96 AA. 
XX 

AC AAR68899; 
XX 

DT 25-MAR-2003 (revised) 

DT 02-MAR-1995 (first entry) 

XX 

DE Human pro-insulin 2.
XX 

KW Pro-insulin; A-chain; B-chain; C-chain; disulphide; mercaptan; 

KW chaotropic agent. 

XX 

OS Homo sapiens.
XX 

PN EP600372-A1. 
XX 

PD 08-JUN-1994. 
XX 

PF 25-NOV-1993; 93EP-00118993.
XX

PR 02-DEC-1992; 92DE-04240420.
XX 

PA (FARH ) HOECHST AG. 
XX 

PI Obermeier R, Gerl M, Ludwig J, Sabel W; 
XX 

DR WPI; 1994-177718/22. 
XX 

PT Prodn. of pro-insulin with correct di:sulphide bridges - by treating

PT recombinant precursor protein with mercaptan in alkali and in presence of

PT chaotropic agent, then isolation on hydrophobic resin.

XX 

PS Disclosure; Page 11; 15pp; German. 
XX 

CC Pro-insulin is produced by treating recombinant precursor protein with a 

CC mercaptan to provide 2-10 SH residues per Cys residue, in presence of a 



CC chaotropic agent and in aq. medium of pH 10-11, treating the prod. with 3

CC -50 g hydrophobic adsorber resin per l aq. medium of pH 4-7, isolating

CC the adsorbed resin and pro-insulin and desorbing the pro-insulin. This 

CC method produces pro-insulin with correctly bonded Cys bridges. Compared 

CC with known methods it involves fewer stages (esp. no sulphitolysis or 

CC cyanogen bromide cleavage) and overall losses during purification are 

CC reduced, i.e. the process is quicker and gives better yields. Sequences 

CC of insulin chain A, B and C are given in AAR68895-97. Sequences of pro- 

CC insulin 1-4 are given in AAR68898-901. (Updated on 25-MAR-2003 to correct

CC PN field.)
XX 

SQ Sequence 96 AA; 

Query Match 100.0%; Score 294; DB 2; Length 96; 

Best Local Similarity 100.0%; Pred. No. 2e-26; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         45 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 96



RESULT 6 
AAR78662 

ID AAR78662 standard; protein; 96 AA. 
XX 

AC AAR78662; 
XX 

DT 03-APR-1996 (first entry) 
XX 

DE Fusion protein contg. proinsulin sequence 3. 
XX 

KW Proinsulin; post-translational modification; recombinant production; 

KW protein folding; conformation. 

XX 

OS Synthetic. 
XX 

FH Key Location/Qualifiers
FT Region 41..44
FT /label= R2
FT /note= "a peptide of 4 amino acids"
FT Peptide 45..74
FT /label= R1-(B2-B29)-Y
FT /note= "human insulin B-chain"
FT Region 75
FT /label= X
FT Peptide 76..96
FT /label= Gly-(A2-A20)-R3
FT /note= "human insulin A-chain"

XX 

PN EP668292-A2.
XX

PD 23-AUG-1995.
XX

PF 09-FEB-1995; 95EP-00101748.
XX

PR 18-FEB-1994; 94DE-04405179.
XX

PA (FARH ) HOECHST AG.
XX 

PI Obermeier R, Gerl M, Ludwig J, Sabel W; 
XX 

DR WPI; 1995-284754/38. 
XX 

PT Isolation of insulin that is correctly post-translationally processed - 

PT by reacting pro:insulin with a mercaptan in the presence of a chaotropic

PT agent and purificn. after absorption to hydrophobic resin. 
XX 

PS Example 2; Page 8; 16pp; German. 
XX 

CC The present sequence is that of a fusion protein, produced in E.coli 

CC which contains an example of a proinsulin molecule corresp. to the 

CC general formula R2-R1-(B2-B29)-Y-X-Gly-(A2-A20)-R3 (II). In formula (II),

CC X = Lys, Arg or a peptide of 2-35 amino acids contg. Lys or Arg at the N-

CC and C-termini; Y = a natural amino acid; R1 = Phe or a bond; R2 = H, Arg,

CC Lys, a peptide of 2-45 amino acids contg. Arg or Lys at the N- and C- 

CC termini; R3 = a natural amino acid; (A2-A20) and (B2-B29) are the insulin 

CC A- and B-chain sequences from human or other insulin. The proinsulin 

CC molecule, released by cyanogen bromide, is reacted with mercaptan at a 

CC ratio of 2-10 SH residues of mercaptan per Cys residue of proinsulin. The 

CC reaction takes place in the presence of a chaotropic auxiliary agent at 

CC pH 10-11 and results in proinsulin with correctly linked cystine bridges. 

CC Reaction with trypsin and opt. carboxypeptidase B yields correctly folded 

CC insulin. The insulin is isolated by absorption on a hydrophobic resin

XX 

SQ Sequence 96 AA; 

Query Match 100.0%; Score 294; DB 2; Length 96; 

Best Local Similarity 100.0%; Pred. No. 2e-26; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         45 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 96



RESULT 7 
AAY42860 

ID AAY42860 standard; protein; 107 AA. 
XX 

AC AAY42860; 
XX 

DT 19-JAN-2000 (first entry) 
XX 

DE hGH-mini-proinsulin chimeric protein. 
XX 

KW Insulin; precursor; growth hormone; chaperone; intramolecular; folding; 

KW conformation; chimeric protein; cleavable; recombinant; production; 

KW yield. 
XX 

OS Synthetic. 

OS Homo sapiens. 
XX 

PN WO9950302-A1. 



XX 

PD 07-OCT-1999. 
XX 

PF 31-MAR-1998; 98WO-CN000052.
XX

PR 31-MAR-1998; 98WO-CN000052.
XX 

PA (TONG-) TONGHUA GANTECH BIOTECHNOLOGY LTD. 
XX 

PI Gan Z; 
XX 

DR WPI; 1999-610839/52. 
XX 

PT New chimeric proteins containing human growth hormone fragment, used 

PT particularly for the production of human insulin. 

XX 

PS Claim 13; Page 30; 46pp; English. 
XX 

CC This sequence represents a chimeric protein, hGH-mini-proinsulin. This

CC chimeric protein contains an N-terminal fragment of human growth hormone 

CC (hGH) of the sequence given in AAY42855, a cleavable peptide linker 

CC (AAY42857), and a human insulin precursor comprising insulin A and B 

CC chains (AAY42859). The hGH portion of the chimeric protein acts as an

CC intramolecular chaperone (IMC) for the insulin precursor, enabling it to 

CC fold correctly. The cleavable peptide linker has a C-terminal Arg residue 

CC which enables the hGH portion of the chimeric protein to be removed after 

CC folding has taken place. Production of recombinant human insulin via an 

CC hGH-proinsulin chimeric protein can provide human insulin with correctly 

CC linked cysteine bridges with fewer necessary procedural steps, and hence 

CC resulting in a higher yield of human insulin. The IMC sequences not only 

CC protect insulin sequences from intracellular degradation by a 

CC microorganism host, but also promote the folding of the fused insulin 

CC precursor, facilitate the solubility of the fusion protein and decrease 

CC the intermolecular interactions among the fusion proteins, thus allowing 

CC folding of the fused insulin precursor at commercially useful high 

CC concentrations. The procedural steps of cyanogen bromide cleavage, 

CC oxidative sulphitolysis and related purification steps can thus be 

CC eliminated, along with the use of high concentrations of mercaptan or the 

CC use of hydrophobic absorbent resins 

XX 

SQ Sequence 107 AA; 

Query Match 100.0%; Score 294; DB 2; Length 107;
Best Local Similarity 100.0%; Pred. No. 2.2e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         56 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 107



RESULT 8 
AAR98897 

ID AAR98897 standard; protein; 116 AA. 
XX 

AC AAR98897; 
XX 



DT 03-FEB-1997 (first entry) 
XX 

DE SOD-proinsulin hybrid polypeptide. 
XX 

KW Insulin; proinsulin; hybrid polypeptide; protein folding; 

KW enzymatic cleavage; cyanogen bromide; sulphitolysis.

XX 

OS Homo sapiens. 
XX 

PN WO9620724-A1. 
XX 

PD 11-JUL-1996.
XX

PF 29-DEC-1994; 94WO-US013268.
XX

PR 29-DEC-1994; 94WO-US013268.
XX 

PA (BIOT-) BIO-TECHNOLOGY GENERAL CORP. 
XX 

PI Hartman JR, Mendelovitz S, Gorecki M; 
XX 

DR WPI; 1996-333766/33. 

DR N-PSDB; AAT34670. 
XX 

PT Recombinant insulin prodn. by correctly folding pro-insulin hybrid 

PT polypeptide - then enzymatic cleavage of folded product, does not require 

PT sulphite protection of SH nor use of cyanogen bromide. 

XX 

PS Example IB; Fig 7; 69pp; English. 
XX 

CC A new method for the production of recombinant human insulin comprises 

CC folding a hybrid polypeptide comprising proinsulin under conditions that 

CC permit correct disulphide bond formation and subjecting that folded 

CC protein to enzymatic cleavage. The insulin produced can then be purified. 

CC This sequence is a SOD-insulin B chain-Arg-insulin A chain hybrid 

CC polypeptide and is encoded by the plasmid construct pDBAST-LAT. 

CC Transformation of the proper E.coli host cells with pDBAST-LAT results in 

CC the efficient expression of the proinsulin hybrid polypeptide, useful for 

CC human insulin production. The method produces recombinant human insulin 

CC identical to the natural hormone. Hazardous and cumbersome procedures 

CC involving cyanogen bromide and sulphitolysis to protect SH groups are 

CC avoided since the entire hybrid polypeptide folds efficiently to the 

CC native structure even with the leader attached and Cys unprotected 

XX 

SQ Sequence 116 AA; 

Query Match 100.0%; Score 294; DB 2; Length 116;
Best Local Similarity 100.0%; Pred. No. 2.4e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         65 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 116



RESULT 9 
AAR71692 



ID AAR71692 standard; protein; 137 AA. 
XX 

AC AAR71692; 
XX 

DT 25-MAR-2003 (revised) 

DT 20-NOV-1995 (first entry) 

XX 

DE Mating factor alpha 1-Insulin precursor ArgB31. 
XX 

KW Human insulin precursor ArgB31; diabetes; Zinc ion complex; 

KW mating factor alpha 1. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers
FT Protein 1..85
FT /label= mating factor alpha-1
FT Peptide 86..116
FT /label= B-chain
FT Peptide 117..137
FT /label= A-chain

XX 

PN WO9507931-A1. 
XX 

PD 23-MAR-1995. 
XX 

PF 16-SEP-1994; 94WO-DK000347.
XX

PR 17-SEP-1993; 93DK-00001044.

PR 02-FEB-1994; 94US-00190829.
XX 

PA (NOVO ) NOVO-NORDISK AS. 
XX 

PI Havelund S, Halstrom JB, Jonassen I, Andersen AS, Markussen J; 
XX 

DR WPI; 1995-131314/17. 

DR N-PSDB; AAQ86425. 
XX 

PT Acylated insulin deriv. which may be present as a Zinc ion complex - is 

PT used to treat diabetes and is rapid acting. 

XX 

PS Example 5; Page 78; 100pp; English.
XX 

CC AAQ86425 encodes AAR71692 mating factor alpha 1-Insulin precursor ArgB31. 

CC ArgB31 comprises the B and A chains of a claimed human insulin 

CC derivative. In the final claimed compsn. they are covalently connected 

CC via disulphide bonds between Cys residues A7/B7 and A20/B19. The 

CC derivative, which may be present as a zinc ion complex, can be used as a 

CC fast action treatment for diabetes. (Updated on 25-MAR-2003 to correct PN 

CC field.) 

XX 

SQ Sequence 137 AA; 



Query Match 100.0%; Score 294; DB 2; Length 137;
Best Local Similarity 100.0%; Pred. No. 2.8e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         86 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 137
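
The feature table of this entry can be related to the aligned Db range. The sketch below is illustrative only: it hard-codes the FT locations of AAR71692 and reports which features overlap Db positions 86 to 137. The function names are hypothetical and not part of any GenCore or Geneseq tool.

```python
# Feature table of AAR71692 (RESULT 9), copied from its FT lines.
FEATURES = [
    ("mating factor alpha-1", "1..85"),
    ("B-chain", "86..116"),
    ("A-chain", "117..137"),
]


def parse_location(location: str) -> tuple[int, int]:
    """Turn 'start..end' (or a single position) into inclusive 1-based bounds."""
    start, _, end = location.partition("..")
    return int(start), int(end or start)


def features_overlapping(db_start: int, db_end: int) -> list[str]:
    """Labels of features sharing at least one residue with the Db range."""
    hits = []
    for label, location in FEATURES:
        start, end = parse_location(location)
        if start <= db_end and end >= db_start:
            hits.append(label)
    return hits


aligned = features_overlapping(86, 137)  # Db range of the RESULT 9 alignment
print(aligned)
```

The B-chain feature (31 residues, including ArgB31) and the A-chain feature (21 residues) together span 52 positions, which equals the query length, so the hit corresponds to the insulin precursor portion downstream of the mating factor alpha-1 leader.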



RESULT 10 
AAR71694 

ID AAR71694 standard; protein; 145 AA. 
XX 

AC AAR71694; 
XX 

DT 25-MAR-2003 (revised) 

DT 20-NOV-1995 (first entry) 

XX 

DE Mating factor alpha 1-Insulin precursor ArgB1, ArgB31 N-terminal.
XX

KW Human insulin precursor ArgB1, ArgB31; diabetes; Zinc ion complex;

KW mating factor alpha 1; N-terminal EEAEAEAR.

XX 

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT Protein 1..85
FT /label= mating factor alpha-1
FT Peptide 86..93
FT /label= N-terminal peptide
FT Peptide 94..124
FT /label= B-chain
FT Peptide 125..145
FT /label= A-chain

XX 

PN WO9507931-A1.
XX

PD 23-MAR-1995.
XX

PF 16-SEP-1994; 94WO-DK000347.
XX

PR 17-SEP-1993; 93DK-00001044.

PR 02-FEB-1994; 94US-00190829.
XX 

PA (NOVO ) NOVO-NORDISK AS. 
XX 

PI Havelund S, Halstrom JB, Jonassen I, Andersen AS, Markussen J; 
XX 

DR WPI; 1995-131314/17. 

DR N-PSDB; AAQ86429. 
XX 

PT Acylated insulin deriv. which may be present as a Zinc ion complex - is 

PT used to treat diabetes and is rapid acting. 

XX 

PS Example 5; Page 82-83; 100pp; English.
XX

CC AAQ86429 encodes AAR71694 mating factor alpha 1-Insulin precursor ArgB1,

CC ArgB31 N-terminal EEAEAEAR. The insulin precursor comprises the B and A 

CC chains of a claimed human insulin derivative preceded by the N-terminal 

CC amino acids EEAEAEAR. In the final claimed compsn. they are covalently 

CC connected via disulphide bonds between Cys residues A7/B7 and A20/B19. 



CC The derivative, which may be present as a zinc ion complex, can be used 

CC as a fast action treatment for diabetes. (Updated on 25-MAR-2003 to 

CC correct PN field.) 
XX 

SQ Sequence 145 AA; 

Query Match 100.0%; Score 294; DB 2; Length 145;
Best Local Similarity 100.0%; Pred. No. 3e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         94 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 145



RESULT 11
AAR71695

ID AAR71695 standard; protein; 146 AA.
XX
AC AAR71695;
XX
DT 25-MAR-2003 (revised)
DT 20-NOV-1995 (first entry)
XX
DE Mating factor alpha 1-Insulin precursor ArgB1, ArgB31 N-terminal.
XX
KW Human insulin precursor ArgB1, ArgB31; diabetes; Zinc ion complex;
KW mating factor alpha 1; N-terminal EEAEAEAER.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Protein 1..85
FT /label= mating factor alpha-1
FT Peptide 86..94
FT /label= N-terminal peptide
FT Peptide 95..125
FT /label= B-chain
FT Peptide 126..146
FT /label= A-chain
XX
PN WO9507931-A1.
XX
PD 23-MAR-1995.
XX
PF 16-SEP-1994; 94WO-DK000347.
XX
PR 17-SEP-1993; 93DK-00001044.
PR 02-FEB-1994; 94US-00190829.
XX
PA (NOVO ) NOVO-NORDISK AS.
XX
PI Havelund S, Halstrom JB, Jonassen I, Andersen AS, Markussen J;
XX
DR WPI; 1995-131314/17.
DR N-PSDB; AAQ86432.
XX



PT Acylated insulin deriv. which may be present as a Zinc ion complex - is 

PT used to treat diabetes and is rapid acting. 

XX 

PS Example 6; Page 85; 100pp; English.
XX

CC AAQ86432 encodes AAR71695 mating factor alpha 1-Insulin precursor ArgB1,

CC ArgB31 N-terminal EEAEAEAER. The insulin precursor comprises the B and A

CC chains of a claimed human insulin derivative preceded by the N-terminal

CC amino acids EEAEAEAER. In the final claimed compsn. they are covalently

CC connected via disulphide bonds between Cys residues A7/B7 and A20/B19. 

CC The derivative, which may be present as a zinc ion complex, can be used 

CC as a fast action treatment for diabetes. (Updated on 25-MAR-2003 to 

CC correct PN field.) 
XX 

SQ Sequence 146 AA; 

Query Match 100.0%; Score 294; DB 2; Length 146;
Best Local Similarity 100.0%; Pred. No. 3e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         95 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 146



RESULT 12 
AAY42861 

ID AAY42861 standard; protein; 150 AA. 
XX 

AC AAY42861;
XX 

DT 19-JAN-2000 (first entry) 
XX 

DE Chimeric protein, SEQ ID 7. 
XX 

KW Insulin; precursor; growth hormone; chaperone; intramolecular; folding; 
KW conformation; chimeric protein; cleavable; recombinant; production; 
KW yield. 
XX 

OS Synthetic. 
OS Homo sapiens. 
XX 

PN WO9950302-A1. 
XX 

PD 07-OCT-1999. 
XX 

PF 31-MAR-1998; 98WO-CN000052 . 
XX 

PR 31-MAR-1998; 98WO-CN000052.
XX 

PA (TONG-) TONGHUA GANTECH BIOTECHNOLOGY LTD. 
XX 

PI Gan Z; 
XX 

DR WPI; 1999-610839/52. 
XX 

PT New chimeric proteins containing human growth hormone fragment, used 



PT particularly for the production of human insulin. 
XX 

PS Claim 14; Page 30-31; 46pp; English. 

XX

CC This sequence represents a chimeric protein, which contains an N-terminal

CC fragment of human growth hormone (hGH) of the sequence given in AAY42856,

CC a cleavable peptide linker (AAY42857), and a human insulin precursor

CC comprising insulin A and B chains (AAY42859). The hGH portion of the

CC chimeric protein acts as an intramolecular chaperone (IMC) for the 

CC insulin precursor, enabling it to fold correctly. The cleavable peptide 

CC linker has a C-terminal Arg residue which enables the hGH portion of the 

CC chimeric protein to be removed after folding has taken place. Production 

CC of recombinant human insulin via an hGH-proinsulin chimeric protein can 

CC provide human insulin with correctly linked cysteine bridges with fewer 

CC necessary procedural steps, and hence resulting in a higher yield of 

CC human insulin. The IMC sequences not only protect insulin sequences from 

CC intracellular degradation by a microorganism host, but also promote the 

CC folding of the fused insulin precursor, facilitate the solubility of the 

CC fusion protein and decrease the intermolecular interactions among the 

CC fusion proteins, thus allowing folding of the fused insulin precursor at 

CC commercially useful high concentrations. The procedural steps of cyanogen 

CC bromide cleavage, oxidative sulphitolysis and related purification steps 

CC can thus be eliminated, along with the use of high concentrations of 

CC mercaptan or the use of hydrophobic absorbent resins 

XX 

SQ Sequence 150 AA; 

Query Match 100.0%; Score 294; DB 2; Length 150;
Best Local Similarity 100.0%; Pred. No. 3.1e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         99 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 150



RESULT 13 
AAR04582 

ID AAR04582 standard; protein; 57 AA. 
XX 

AC AAR04582; 
XX 

DT 25-MAR-2003 (revised) 

DT 14-SEP-1990 (first entry) 



XX

DE Proinsulin analogue with a Lys residue linking the A and B chains
XX

KW insulin fusion protein; pro-insulin analogue; tendamistate;

KW Lys-Lys bridge; ds.
XX 

OS Synthetic. 
XX 

FH Key Location/Qualifiers
FT Peptide 1..35
FT /label= Insulin B chain
FT Misc-difference 36
FT /label= Lys residue linking insulin B chain to A chain
FT Peptide 37..57
FT /label= Insulin A chain
XX 

PN EP367163-A. 
XX 

PD 09-MAY-1990. 
XX 

PF 28-OCT-1989; 89EP-00120056.
XX

PR 03-NOV-1988; 88DE-03837273.

PR 19-AUG-1989; 89DE-03927449.
XX 

PA (FARH ) HOECHST AG. 
XX 

PI Koller KP, Riess GJ, Uhlmann E, Wallmeier H;
XX 

DR WPI; 1990-141149/19. 

DR N-PSDB; AAQ04335. 
XX 

PT New insulin fusion proteins - comprise pro-insulin analogue linked to

PT tendamistate.

XX 

PS Disclosure; Page ?; -pp; German. 
XX 

CC This sequence is joined to the C-terminus of an N-terminal fragment 

CC comprising opt. modified tendamistate. This fusion protein may be

CC converted into human insulin using known methods. The synthetic gene was 

CC prepared by the phosphoramidite method. See also AAQ04336. (Updated on 25 

CC -MAR-2003 to correct PR field.) (Updated on 25-MAR-2003 to correct PI 

CC field.) 

XX 

SQ Sequence 57 AA; 

Query Match 99.0%; Score 291; DB 2; Length 57;
Best Local Similarity 98.1%; Pred. No. 2.6e-26;
Matches 51; Conservative 1; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||:|||||||||||||||||||||
Db          6 FVNQHLCGSHLVEALYLVCGERGFFYTPKTKGIVEQCCTSICSLYQLENYCN 57
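
The two percentages reported for this hit can be reproduced from the other figures in the record. The sketch below assumes that "Query Match" is the hit score divided by the perfect score and that "Best Local Similarity" is the number of identical columns divided by the number of alignment columns; both definitions are inferred from the numbers in this report rather than taken from GenCore documentation, and the function names are illustrative.

```python
PERFECT_SCORE = 294  # from the report header


def query_match(score: float) -> float:
    """Assumed definition: hit score as a percentage of the perfect score."""
    return 100.0 * score / PERFECT_SCORE


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Assumed definition: identical columns as a percentage of all columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# RESULT 13: Score 291; Matches 51; Conservative 1; Mismatches 0; Indels 0.
qm = round(query_match(291), 1)
bls = round(best_local_similarity(51, 1, 0, 0), 1)
print(qm, bls)
```

Under these assumptions the conservative Arg to Lys substitution at query position 31 counts against the similarity percentage even though it still contributes a positive substitution score.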



RESULT 14
AAR11899

ID AAR11899 standard; protein; 52 AA.
XX

AC AAR11899;
XX

DT 25-MAR-2003 (revised)

DT 22-JUL-1991 (first entry)

XX

DE Example of human insulin precursor.
XX

KW Human insulin; diabetes; transpeptidation.
XX

OS Homo sapiens.
XX


PN EP427296-A. 
XX 

PD 15-MAY-1991. 
XX 

PF 29-MAY-1985; 90EP-00121887.
XX 

PR 30-MAY-1984; 84DK-00002665 . 

PR 08-FEB-1985; 85DK-00000582 . 
XX 

PA (NOVO ) NOVO-NORDISK AS. 
XX 

PI Markussen J, Fiil N, Ammerer G, Hansen MT, Thim L, Norris K; 

PI Voigt HO; 

XX 

DR WPI; 1991-141828/20. 
XX 

PT Human insulin precursors - expressed with correctly positioned 

PT di:sulphide bridges giving improved resistance to proteolysis.
XX 

PS Claim 3; Page 18; 28pp; English. 
XX 

CC This human insulin precursor has correctly positioned disulphide bridges 

CC between the A and B chains and is more resistant to proteolytic digestion 

CC than prior art insulin precursors. Yeast strains transformed with DNA 

CC encoding this precursor can be cultured to secrete it in high yields. The

CC precursor can be converted into mature human insulin by transpeptidation.

CC See also AAR11897-98. (Updated on 25-MAR-2003 to correct PF field.) 

CC (Updated on 25-MAR-2003 to correct PA field.) 
XX 

SQ Sequence 52 AA; 

Query Match 97.6%; Score 287; DB 2; Length 52;
Best Local Similarity 96.2%; Pred. No. 6.8e-26;
Matches 50; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||::|||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKSKGIVEQCCTSICSLYQLENYCN 52



RESULT 15 
AAR65883 

ID AAR65883 standard; protein; 53 AA. 
XX 

AC AAR65883; 
XX 

DT 16-OCT-2003 (revised) 
DT 25-MAR-2003 (revised) 
DT 26-JUN-1995 (first entry) 
XX 

DE Di-Arg-(B31-32)-Human insulin amorphous, monospherical deriv. 
XX 

KW Human insulin; recombinant production; amorphous; monospherical form; 
KW diabetes mellitus. 
XX 

OS Homo sapiens; (produced recombinantly in Escherichia coli). 
XX 

FH Key Location/Qualifiers 
FT Protein 1..30 
FT /label= insulin_B-chain 
FT Protein 33..53 
FT /label= insulin_A-chain 
XX 

PN EP622376-A1. 
XX 

PD 02-NOV-1994. 
XX 

PF 21-APR-1994; 94EP-00106196. 
XX 

PR 27-APR-1993; 93DE-04313702. 
XX 

PA (FARH ) HOECHST AG. 
XX 

PI Obermeier R, Sabel W, Deil P, Geisen K; 
XX 

DR WPI; 1994-334579/42. 
XX 

PT Amorphous, mono-spherical form of insulin derivs. - for treating diabetes 

PT mellitus, are produced by diluting soln. in aq. isopropanol, are stable 

PT when dried or in suspension. 
XX 

PS Example 2; Page 5; 10pp; German. 
XX 

CC This sequence is a specific example of an insulin derivative which can be 

CC obtained in amorphous, monospherical form by dissolving in an n- 

CC propanol/buffer mixture (pH 4.5-6.5) having n-propanol content 15% 

CC relative to water. The solution is then diluted with water to reduce n- 

CC propanol content to below 15%. The resulting insulin preparation is 

CC stable and can be used for the treatment of diabetes mellitus. (Updated 

CC on 25-MAR-2003 to correct PN field.) (Updated on 16-OCT-2003 to 

CC standardise OS field) 

XX 

SQ Sequence 53 AA; 

Query Match 96.4%; Score 283.5; DB 2; Length 53; 
Best Local Similarity 98.1%; Pred. No. 1.8e-25; 

Matches 52; Conservative 0; Mismatches 0; Indels 1; Gaps 1; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-RGIVEQCCTSICSLYQLENYCN 52 
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRRGIVEQCCTSICSLYQLENYCN 53 
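
The score of 283.5 reported for this gapped alignment is consistent with an affine gap penalty built from the header values Gapop 10.0 and Gapext 0.5. The sketch below assumes that a gap of length n costs Gapop + n * Gapext; this form is inferred from the reported score and is not documented in the report itself. The names are illustrative.

```python
GAP_OPEN = 10.0      # "Gapop" from the search header
GAP_EXTEND = 0.5     # "Gapext" from the search header
PERFECT_SCORE = 294  # "Perfect score" from the search header


def gap_penalty(length):
    """Assumed affine penalty: opening cost plus a per-residue extension."""
    if length <= 0:
        return 0.0
    return GAP_OPEN + GAP_EXTEND * length


# All 52 query residues are matched identically, so the substitution part
# of the score equals the perfect score; one gap of length 1 is subtracted.
score = PERFECT_SCORE - gap_penalty(1)
query_match = round(100 * score / PERFECT_SCORE, 1)

# 52 identical pairs over 53 alignment columns (52 residues plus one gap).
best_local_similarity = round(100 * 52 / 53, 1)

print(score, query_match, best_local_similarity)
```

With these assumptions the sketch reproduces Score 283.5, Query Match 96.4 and Best Local Similarity 98.1 as printed for this result.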



Search completed: July 15, 2004, 16:35:34 
Job time : 28.1642 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: July 15, 2004, 16:30:45 ; Search time 7.85821 Seconds 

(without alignments) 
341.624 Million cell updates/sec 

Title: US-09-423-100-5 

Perfect score: 294 

Sequence: 1 FVNQHLCGSHLVEALYLVCG IVEQCCTSICSLYQLENYCN 52 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : Issued_Patents_AA:* 

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:* 

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:* 

3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:* 

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:* 

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:* 

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
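
The perfect score of 294 given in the search header can be reproduced by scoring the query against itself. The sketch below assumes the standard BLOSUM62 self-match (diagonal) values and uses the full 52-residue query as it appears on the Qy lines of the alignments; the function name is illustrative.

```python
# Diagonal (self-match) scores of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Full query sequence, copied from the Qy lines of the alignments.
QUERY = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN"


def perfect_score(sequence):
    """Score of a sequence aligned against itself without gaps."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence)


print(len(QUERY), perfect_score(QUERY))
```

The sum over the 52 residues is 294, the value against which each Query Match percentage in the summaries is expressed.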
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100. 


0 


145 


3 


us- 


08 


-975- 
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08- 


975- 
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Sequence 
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3, Appli 


Result           Query 
   No.   Score  Match %  Length  DB  ID                  Description 
------------------------------------------------------------------------ 
    18   278.5     94.7      51   4  US-09-723-896-3     Sequence 3, Appli 
    19   277.5     94.4      53   1  US-08-233-617-3     Sequence 3, Appli 
    20   277       94.2      65   3  US-08-900-574-3     Sequence 3, Appli 
    21   276.5     94.0      55   3  US-08-900-574-6     Sequence 6, Appli 
    22   276.5     94.0      66   3  US-08-900-574-5     Sequence 5, Appli 
    23   276.5     94.0      67   4  US-08-981-988A-1    Sequence 1, Appli 
    24   276.5     94.0      67   4  US-08-981-988A-5    Sequence 5, Appli 
    25   276       93.9      67   3  US-08-900-574-7     Sequence 7, Appli 
    26   275.5     93.7      53   3  US-09-261-853-2     Sequence 2, Appli 
    27   275.5     93.7      65   1  US-08-468-674B-71   Sequence 71, Appl 
    28   275.5     93.7      65   1  US-08-780-571-71    Sequence 71, Appl 
    29   275.5     93.7      89   1  US-08-468-674B-41   Sequence 41, Appl 
    30   275.5     93.7      89   1  US-08-780-571-41    Sequence 41, Appl 
    31   275.5     93.7      91   1  US-08-468-674B-45   Sequence 45, Appl 
    32   275.5     93.7      91   1  US-08-780-571-45    Sequence 45, Appl 
    33   275.5     93.7     104   1  US-08-400-256-15    Sequence 15, Appl 
    34   275.5     93.7     104   3  US-08-975-365-15    Sequence 15, Appl 
    35   275.5     93.7     117   3  US-09-012-669F-37   Sequence 37, Appl 
    36   275.5     93.7     124   1  US-08-446-646-3     Sequence 3, Appli 
    37   275.5     93.7     124   3  US-09-012-669F-36   Sequence 36, Appl 
    38   275.5     93.7     138   3  US-08-932-082-19    Sequence 19, Appl 
    39   275.5     93.7     138   4  US-09-861-687-19    Sequence 19, Appl 
    40   275.5     93.7     140   1  US-08-400-256-33    Sequence 33, Appl 
    41   275.5     93.7     140   1  US-08-400-256-42    Sequence 42, Appl 
    42   275.5     93.7     140   3  US-08-975-365-33    Sequence 33, Appl 
    43   275.5     93.7     140   3  US-08-975-365-42    Sequence 42, Appl 
    44   273.5     93.0      67   4  US-08-981-988A-2    Sequence 2, Appli 
    45   272.5     92.7      53   3  US-08-900-574-4     Sequence 4, Appli 



ALIGNMENTS 



RESULT 1 

US-08-160-376A-7 

Sequence 7, Application US/08160376A 
Patent No. 5473049 
GENERAL INFORMATION: 

APPLICANT: Obermeier, Ranier 
APPLICANT: Gerl, Martin 
APPLICANT: Ludwig, Jurgen 
APPLICANT: Sabel, Walter 

TITLE OF INVENTION: Process For Obtaining Proinsulin 
TITLE OF INVENTION: Possessing Correctly Linked 
TITLE OF INVENTION: Cystine Bridges 
NUMBER OF SEQUENCES: 7 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Kenneth A. Genoni, Esq. 
STREET: Rt. 202-206 North/P.O. Box 2500 
CITY: Somerville 
STATE: New Jersey 



; COUNTRY: U.S.A. 

; ZIP: 08876-1258 

; COMPUTER READABLE FORM: 

; MEDIUM TYPE: DISKETTE, 3.5 INCH, 1.44 Mb STORAGE 

; COMPUTER: IBM 386 

OPERATING SYSTEM: WINDOWS 3.1 
; SOFTWARE: WORDPERFECT 5.1 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/160,376A 
; FILING DATE: December 1, 1993 

; CLASSIFICATION: 530 

; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: GE P 4240420.7 
; FILING DATE: December 2, 1992 

ATTORNEY/AGENT INFORMATION: 
; NAME: Barbara V. Maurer, Esq. 

; REGISTRATION NUMBER: 31,287 

; REFERENCE/DOCKET NUMBER: HOE 92/F 384 

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: (908) 231-4079 

TELEFAX: (908) 231-2255 
; INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 56 Amino Acids 
; TYPE: Amino Acid (AA) 

; TOPOLOGY: not relevant 

US-08-160-376A-7 



Query Match 100.0%; Score 294; DB 1; Length 56; 

Best Local Similarity 100.0%; Pred. No. 7.3e-29; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   5 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 56 



RESULT 2 

US-08-389-487-11 

Sequence 11, Application US/08389487 
Patent No. 5663291 
GENERAL INFORMATION: 

APPLICANT: Obermeier, Rainer 
APPLICANT: Gerl, Martin 
APPLICANT: Ludwig, Jurgen 
APPLICANT: Sabel, Walter 

TITLE OF INVENTION: Process for Obtaining Insulin Having 
TITLE OF INVENTION: Correctly Linked Cystine Bridges 
NUMBER OF SEQUENCES: 12 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Finnegan, Henderson, Farabow, Garrett & 
ADDRESSEE: Dunner 
STREET: 1300 I Street, N.W. 
CITY: Washington 
STATE: D.C. 

COUNTRY: United States of America 
ZIP: 20005-3315 



COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/389,487 

; FILING DATE: 

; CLASSIFICATION: 530 

; ATTORNEY/AGENT INFORMATION: 

NAME: Einaudi, Carol P. 

REGISTRATION NUMBER: 32,220 

REFERENCE/DOCKET NUMBER: 02481.1424-00000 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 202-408-4000 

; TELEFAX: 202-408-4400 

; INFORMATION FOR SEQ ID NO: 11: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 56 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
; MOLECULE TYPE: peptide 
US-08-389-487-11 

Query Match 100.0%; Score 294; DB 1; Length 56; 

Best Local Similarity 100.0%; Pred. No. 7.3e-29; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   5 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 56 
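
Each alignment in this report is preceded by a statistics line of the form "Query Match ...; Score ...; DB ...; Length ...;". The following is a minimal sketch of how such a line might be read programmatically; the pattern and the function name are illustrative assumptions, not part of GenCore, and the pattern expects the clean form of the line (OCR-damaged variants such as a split score will not match).

```python
import re

# Pattern for the per-result statistics line (clean form only).
STATS_PATTERN = re.compile(
    r"Query Match\s+(?P<query_match>[\d.]+)%;\s*"
    r"Score\s+(?P<score>[\d.]+);\s*"
    r"DB\s+(?P<db>\d+);\s*"
    r"Length\s+(?P<length>\d+);"
)


def parse_stats(line):
    """Return the four fields of a statistics line, or None if no match."""
    match = STATS_PATTERN.search(line)
    if match is None:
        return None
    return {
        "query_match": float(match["query_match"]),
        "score": float(match["score"]),
        "db": int(match["db"]),
        "length": int(match["length"]),
    }


print(parse_stats("Query Match 100.0%; Score 294; DB 1; Length 56;"))
```

Returning None for lines that do not match keeps damaged lines visible to the caller instead of silently producing wrong numbers.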



RESULT 3 

US-08-160-376A-6 

Sequence 6, Application US/08160376A 
Patent No. 5473049 
GENERAL INFORMATION: 

APPLICANT: Obermeier, Ranier 
APPLICANT: Gerl, Martin 
APPLICANT: Ludwig, Jurgen 
APPLICANT: Sabel, Walter 

TITLE OF INVENTION: Process For Obtaining Proinsulin 
TITLE OF INVENTION: Possessing Correctly Linked 
TITLE OF INVENTION: Cystine Bridges 
NUMBER OF SEQUENCES: 7 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Kenneth A. Genoni, Esq. 
STREET: Rt. 202-206 North/P.O. Box 2500 
CITY: Somerville 
STATE: New Jersey 
COUNTRY: U.S.A. 
ZIP: 08876-1258 
COMPUTER READABLE FORM: 

MEDIUM TYPE: DISKETTE, 3.5 INCH, 1.44 Mb STORAGE 
COMPUTER: IBM 386 



OPERATING SYSTEM: WINDOWS 3.1 
; SOFTWARE: WORDPERFECT 5.1 

; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/08/160,376A 

FILING DATE: December 1, 1993 

CLASSIFICATION: 530 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: GE P 4240420.7 

FILING DATE: December 2, 1992 
ATTORNEY/AGENT INFORMATION: 
; NAME: Barbara V. Maurer, Esq. 

REGISTRATION NUMBER: 31,287 

REFERENCE/DOCKET NUMBER: HOE 92/F 384 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (908) 231-4079 
; TELEFAX: (908) 231-2255 

; INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 63 Amino Acids 
; TYPE: Amino Acid (AA) 

TOPOLOGY: not relevant 
US-08-160-376A-6 

Query Match 100.0%; Score 294; DB 1; Length 63; 

Best Local Similarity 100.0%; Pred. No. 8.3e-29; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  12 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 63 



RESULT 4 

US-08-291-060B-5 

Sequence 5, Application US/08291060B 
Patent No. 5728543 
GENERAL INFORMATION: 

APPLICANT: Dorschug, Michael 
APPLICANT: Koller, Klaus-Peter 
APPLICANT: Marquardt, Rudiger 
APPLICANT: Meiwes, Johannes 

TITLE OF INVENTION: An Enzymatic Process for the 
TITLE OF INVENTION: Conversion of Preproinsulins Into Insulins 
NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Finnegan, Henderson, Farabow, Garrett & 
ADDRESSEE: Dunner, L.L.P. 
STREET: 1300 I Street, N.W. 
CITY: Washington 
STATE: D.C. 
COUNTRY: USA 
ZIP: 20005-3315 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 



CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/291,060B 

FILING DATE: 08-AUG-1994 
; CLASSIFICATION: 435 

ATTORNEY/AGENT INFORMATION: 

NAME: Einaudi, Carol P. 

REGISTRATION NUMBER: 32,220 

REFERENCE/DOCKET NUMBER: 02481.1105-02000 
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: (202) 408-4366 

TELEFAX: (202) 408-4400 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 66 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 

; MOLECULE TYPE: peptide 
US-08-291-060B-5 



Query Match 100.0%; Score 294; DB 1; Length 66; 

Best Local Similarity 100.0%; Pred. No. 8.8e-29; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  15 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 66 



RESULT 5 

US-08-160-376A-5 

; Sequence 5, Application US/08160376A 
; Patent No. 5473049 

GENERAL INFORMATION: 
; APPLICANT: Obermeier, Ranier 

APPLICANT: Gerl, Martin 
; APPLICANT: Ludwig, Jurgen 
; APPLICANT: Sabel, Walter 

; TITLE OF INVENTION: Process For Obtaining Proinsulin 

TITLE OF INVENTION: Possessing Correctly Linked 
; TITLE OF INVENTION: Cystine Bridges 
NUMBER OF SEQUENCES: 7 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Kenneth A. Genoni, Esq. 

STREET: Rt. 202-206 North/P.O. Box 2500 
; CITY: Somerville 

; STATE: New Jersey 

COUNTRY: U.S.A. 

ZIP: 08876-1258 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: DISKETTE, 3.5 INCH, 1.44 Mb STORAGE 

COMPUTER: IBM 386 
; OPERATING SYSTEM: WINDOWS 3.1 

SOFTWARE: WORDPERFECT 5.1 
; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/08/160,376A 

FILING DATE: December 1, 1993 



CLASSIFICATION: 530 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: GE P 4240420.7 

FILING DATE: December 2, 1992 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Barbara V. Maurer, Esq. 

; REGISTRATION NUMBER: 31,287 

REFERENCE/DOCKET NUMBER: HOE 92/F 384 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (908) 231-4079 

; TELEFAX: (908) 231-2255 

; INFORMATION FOR SEQ ID NO: 5: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 96 Amino Acids 

; TYPE: Amino Acid (AA) 

; TOPOLOGY: not relevant 

US-08-160-376A-5 

Query Match 100.0%; Score 294; DB 1; Length 96; 

Best Local Similarity 100.0%; Pred. No. 1.3e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  45 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 96 



RESULT 6 
US-08-389-487-8 

; Sequence 8, Application US/08389487 

; Patent No. 5663291 

; GENERAL INFORMATION: 

; APPLICANT: Obermeier, Rainer 

APPLICANT: Gerl, Martin 
; APPLICANT: Ludwig, Jurgen 

APPLICANT: Sabel, Walter 
; TITLE OF INVENTION: Process for Obtaining Insulin Having 

TITLE OF INVENTION: Correctly Linked Cystine Bridges 

NUMBER OF SEQUENCES: 12 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Finnegan, Henderson, Farabow, Garrett & 
; ADDRESSEE: Dunner 

STREET: 1300 I Street, N.W. 
; CITY: Washington 

; STATE: D.C. 

; COUNTRY: United States of America 

; ZIP: 20005-3315 

COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.25 

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/389,487 

; FILING DATE: 

CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 



; NAME: Einaudi, Carol P. 

; REGISTRATION NUMBER: 32,220 

REFERENCE/DOCKET NUMBER: 02481.1424-00000 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 202-408-4000 

TELEFAX: 202-408-4400 
; INFORMATION FOR SEQ ID NO: 8: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 96 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 

MOLECULE TYPE: peptide 
US-08-389-487-8 

Query Match 100.0%; Score 294; DB 1; Length 96; 

Best Local Similarity 100.0%; Pred. No. 1.3e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  45 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 96 



RESULT 7 

US-08-400-256-39 

Sequence 39, Application US/08400256 
Patent No. 5750497 
GENERAL INFORMATION: 

APPLICANT: Havelund, Svend 
APPLICANT: Halstrom, John 
APPLICANT: Jonassen, Ib 
APPLICANT: Andersen, Asser Sloth 
APPLICANT: Markussen, Jan 
TITLE OF INVENTION: ACYLATED INSULIN 
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Novo Nordisk of North America, Inc. 
STREET: 405 Lexington Avenue, 64th Floor 
CITY: New York 
STATE: New York 

COUNTRY: United States of America 
ZIP: 10174-6401 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/400,256 
FILING DATE: 03-MAR-1995 
CLASSIFICATION: 514 
ATTORNEY/AGENT INFORMATION: 
NAME: Lambiris, Elias J. 
REGISTRATION NUMBER: 33,728 
REFERENCE/DOCKET NUMBER: 3985.220-US 
TELECOMMUNICATION INFORMATION: 



TELEPHONE: 212-867-0123 
TELEFAX: 212-878-9655 
INFORMATION FOR SEQ ID NO: 39: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 137 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-400-256-39 

Query Match 100.0%; Score 294; DB 1; Length 137; 

Best Local Similarity 100.0%; Pred. No. 1.9e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  86 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 137 



RESULT 8 

US-08-975-365-39 

Sequence 39, Application US/08975365 
Patent No. 6011007 
GENERAL INFORMATION: 

APPLICANT: Havelund, Svend 
APPLICANT: Halstrom, John 
APPLICANT: Jonassen, Ib 
APPLICANT: Andersen, Asser Sloth 
APPLICANT: Markussen, Jan 
TITLE OF INVENTION: ACYLATED INSULIN 
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Novo Nordisk of North America, Inc. 
STREET: 405 Lexington Avenue, 64th Floor 
CITY: New York 
STATE: New York 

COUNTRY: United States of America 
ZIP: 10174-6401 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/975,365 
FILING DATE: 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/400,256 
FILING DATE: 03-MAR-1995 
ATTORNEY/AGENT INFORMATION: 
NAME: Lambiris, Elias J. 
REGISTRATION NUMBER: 33,728 
REFERENCE/DOCKET NUMBER: 3985.220-US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 212-867-0123 
TELEFAX: 212-878-9655 



; INFORMATION FOR SEQ ID NO: 39: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 137 amino acids 

; TYPE: amino acid 

; TOPOLOGY: linear 

; MOLECULE TYPE: protein 
US-08-975-365-39 

Query Match 100.0%; Score 294; DB 3; Length 137; 

Best Local Similarity 100.0%; Pred. No. 1.9e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  86 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 137 



RESULT 9 

US-08-400-256-45 

Sequence 45, Application US/08400256 
Patent No. 5750497 
GENERAL INFORMATION: 

APPLICANT: Havelund, Svend 
APPLICANT: Halstrom, John 
APPLICANT: Jonassen, Ib 
APPLICANT: Andersen, Asser Sloth 
APPLICANT: Markussen, Jan 
TITLE OF INVENTION: ACYLATED INSULIN 
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Novo Nordisk of North America, Inc. 
STREET: 405 Lexington Avenue, 64th Floor 
CITY: New York 
STATE: New York 

COUNTRY: United States of America 
ZIP: 10174-6401 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/400,256 
FILING DATE: 03-MAR-1995 
CLASSIFICATION: 514 
ATTORNEY/AGENT INFORMATION: 
NAME: Lambiris, Elias J. 
REGISTRATION NUMBER: 33,728 
REFERENCE/DOCKET NUMBER: 3985.220-US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 212-867-0123 
TELEFAX: 212-878-9655 
INFORMATION FOR SEQ ID NO: 45: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 145 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 



; MOLECULE TYPE: protein 
US-08-400-256-45 



Query Match 100.0%; Score 294; DB 1; Length 145; 

Best Local Similarity 100.0%; Pred. No. 2e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
       |||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  94 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 145 



RESULT 10 
US-08-975-365-45 

Sequence 45, Application US/08975365 
Patent No. 6011007 
GENERAL INFORMATION: 

APPLICANT: Havelund, Svend 
APPLICANT: Halstrom, John 
APPLICANT: Jonassen, Ib 
APPLICANT: Andersen, Asser Sloth 
APPLICANT: Markussen, Jan 
TITLE OF INVENTION: ACYLATED INSULIN 
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Novo Nordisk of North America, Inc. 
STREET: 405 Lexington Avenue, 64th Floor 
CITY: New York 
STATE: New York 

COUNTRY: United States of America 
ZIP: 10174-6401 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/975,365 
FILING DATE: 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/400,256 
FILING DATE: 03-MAR-1995 
ATTORNEY/AGENT INFORMATION: 
NAME: Lambiris, Elias J. 
REGISTRATION NUMBER: 33,728 
REFERENCE/DOCKET NUMBER: 3985.220-US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 212-867-0123 
TELEFAX: 212-878-9655 
INFORMATION FOR SEQ ID NO: 45: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 145 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-975-365-45 



Query Match 100.0%; Score 294; DB 3; Length 145; 

Best Local Similarity 100.0%; Pred. No. 2e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
Db  94 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 145 



RESULT 11 
US-08-400-256-48 

Sequence 48, Application US/08400256 
Patent No. 5750497 
GENERAL INFORMATION: 

APPLICANT: Havelund, Svend 
APPLICANT: Halstrom, John 
APPLICANT: Jonassen, Ib 
APPLICANT: Andersen, Asser Sloth 
APPLICANT: Markussen, Jan 
TITLE OF INVENTION: ACYLATED INSULIN 
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Novo Nordisk of North America, Inc. 
STREET: 405 Lexington Avenue, 64th Floor 
CITY: New York 
STATE: New York 

COUNTRY: United States of America 
ZIP: 10174-6401 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/400,256 
FILING DATE: 03-MAR-1995 
CLASSIFICATION: 514 
ATTORNEY/AGENT INFORMATION: 
NAME: Lambiris, Elias J. 
REGISTRATION NUMBER: 33,728 
REFERENCE/DOCKET NUMBER: 3985.220-US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 212-867-0123 
TELEFAX: 212-878-9655 
INFORMATION FOR SEQ ID NO: 48: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 146 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-400-256-48 

Query Match 100.0%; Score 294; DB 1; Length 146; 

Best Local Similarity 100.0%; Pred. No. 2e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
Db  95 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 146 

RESULT 12 
US-08-975-365-48 

Sequence 48, Application US/08975365 
Patent No. 6011007 
GENERAL INFORMATION: 

APPLICANT: Havelund, Svend 
APPLICANT: Halstrom, John 
APPLICANT: Jonassen, Ib 
APPLICANT: Andersen, Asser Sloth 
APPLICANT: Markussen, Jan 
TITLE OF INVENTION: ACYLATED INSULIN 
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Novo Nordisk of North America, Inc. 
STREET: 405 Lexington Avenue, 64th Floor 
CITY: New York 
STATE: New York 

COUNTRY: United States of America 
ZIP: 10174-6401 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/975,365 
FILING DATE: 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/400,256 
FILING DATE: 03-MAR-1995 
ATTORNEY/AGENT INFORMATION: 
NAME: Lambiris, Elias J. 
REGISTRATION NUMBER: 33,728 
REFERENCE/DOCKET NUMBER: 3985.220-US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 212-867-0123 
TELEFAX: 212-878-9655 
INFORMATION FOR SEQ ID NO: 48: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 146 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-975-365-48 

Query Match 100.0%; Score 294; DB 3; Length 146; 

Best Local Similarity 100.0%; Pred. No. 2e-28; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
Db  95 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 146 



RESULT 13 
US-08-030-731A-44 

Sequence 44, Application US/08030731A 
Patent No. 5426036 
GENERAL INFORMATION: 

APPLICANT: Koller, Klaus-Peter 
APPLICANT: Riess, Guenther Johannes 
APPLICANT: Uhlmann, Eugen 
APPLICANT: Wallmeier, Holger 

TITLE OF INVENTION: Processes for the Preparation of Foreign 
TITLE OF INVENTION: Proteins in Streptomycetes 
NUMBER OF SEQUENCES: 48 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Finnegan, Henderson, Farabow, Garrett & 
ADDRESSEE: Dunner 

STREET: 1300 I Street, N.W., Suite 700 
CITY: Washington 
STATE: D.C. 
COUNTRY: USA 
ZIP: 20005-3315 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/030,731A 
FILING DATE: 12-MAR-1993 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/189,840 
FILING DATE: 03-MAY-1988 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/430,622 
FILING DATE: 01-NOV-1989 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/687,610 
FILING DATE: 19-APR-1991 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/735,757 
FILING DATE: 29-JUL-1991 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: DE P 37 14 866.4 
FILING DATE: 05-MAY-1987 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: DE P 38 37 273.8 
FILING DATE: 03-NOV-1988 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: DE P 39 27 449.7 
FILING DATE: 19-AUG-1989 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: DE P 40 12 818.0 
FILING DATE: 21-APR-1990 
ATTORNEY/AGENT INFORMATION: 



NAME: Kirschner Michael K. 
REGISTRATION NUMBER: 34,851 

REFERENCE/DOCKET NUMBER: 02481-0533-02000 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 202-408-4000 
TELEFAX: 202-408-4400 
INFORMATION FOR SEQ ID NO: 44: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 57 amino acids 
TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: peptide 
US-08-030-731A-44 

Query Match 99.0%; Score 291; DB 1; Length 57; 

Best Local Similarity 98.1%; Pred. No. 1.7e-28; 

Matches 51; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 

Qy FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 



RESULT 14 
US-08-233-617-4 

Sequence 4, Application US/08233617 
Patent No. 5466666 
GENERAL INFORMATION: 

APPLICANT: Obermeier, Rainer 
APPLICANT: Sabel, Walter 
APPLICANT: Deil, Peter 
APPLICANT: Geisen, Karl 

TITLE OF INVENTION: Amorphous Monospherical Forms of Insulin 
TITLE OF INVENTION: Derivatives 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Finnegan, Henderson, Farabow, Garrett & 
ADDRESSEE: Dunner 

STREET: 1300 I Street, N.W., Suite 700 
CITY: Washington 
STATE: D.C. 
COUNTRY: USA 
ZIP: 20005-3315 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/233,617 
FILING DATE: 25-APR-1994 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: P 43 13 702.4 
FILING DATE: 27-APR-1993 
ATTORNEY/AGENT INFORMATION: 
NAME: Carol P. Einaudi 



REGISTRATION NUMBER: 32,220 
REFERENCE/DOCKET NUMBER: 02481.1374-00000 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 202-408-4000 
TELEFAX: 202-408-4400 
INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 53 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
ORIGINAL SOURCE: 

ORGANISM: Escherichia coli 
US-08-233-617-4 

Query Match 96.4%; Score 283.5; DB 1; Length 53; 

Best Local Similarity 98.1%; Pred. No. 1.3e-27; 

Matches 52; Conservative 0; Mismatches 0; Indels 1; Gaps 1; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-RGIVEQCCTSICSLYQLENYCN 52 
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRRGIVEQCCTSICSLYQLENYCN 53 



RESULT 15 
US-08-981-988A-42 

Sequence 42, Application US/08981988A 
Patent No. 6337194 
GENERAL INFORMATION: 

APPLICANT: Vittal Mallya Scientific Research Foundation 
APPLICANT: The University of Leicester 
TITLE OF INVENTION: Insulin 
NUMBER OF SEQUENCES: 43 

CORRESPONDENCE ADDRESS: 
ADDRESSEE: VITTAL MALLYA SCIENTIFIC RESEARCH FOUNDATION 

STREET: K. R. ROAD 
CITY: BANGALORE 
COUNTRY: INDIA 
ZIP: 560 004 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/981,988A 
FILING DATE: 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: GB 9513967.1 
FILING DATE: 08-JUL-1995 
INFORMATION FOR SEQ ID NO: 42: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 53 amino acids 
TYPE: amino acid 
STRANDEDNESS: 
TOPOLOGY: unknown 



US-08-981-988A-42 

Query Match 96.4%; Score 283.5; DB 4; Length 53; 

Best Local Similarity 98.1%; Pred. No. 1.3e-27; 

Matches 52; Conservative 0; Mismatches 0; Indels 1; Gaps 1; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-RGIVEQCCTSICSLYQLENYCN 52 



Search completed: July 15, 2004, 16:42:32 
Job time : 8.85821 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: July 15, 2004, 16:29:19 ; Search time 5.8209 Seconds 

(without alignments) 
859.311 Million cell updates/sec 

Title: US-09-423-100-5 

Perfect score: 294 

Sequence: 1 FVNQHLCGSHLVEALYLVCG IVEQCCTSICSLYQLENYCN 52 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : PIR li 

pir1:* 
pir2:* 
pir3:* 
pir4:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
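
The header values can be cross-checked against the per-result lines. The sketch below assumes that "Perfect score" is the query scored against itself under BLOSUM62 and that "Query Match" is Score divided by Perfect score; both assumptions agree with the figures printed in this report but are not stated by it. The dictionary holds the standard BLOSUM62 self-match scores, and the function names are illustrative rather than part of GenCore.

```python
# Standard BLOSUM62 self-match (diagonal) scores for the 20 amino acids.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Full 52-residue query, as printed on the "Qy" alignment rows.
QUERY = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN"


def perfect_score(seq):
    """Score of a sequence aligned to itself (assumed meaning of 'Perfect score')."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in seq)


def query_match_percent(score, perfect):
    """Assumed definition of 'Query Match': score as a percentage of the perfect score."""
    return round(100.0 * score / perfect, 1)


if __name__ == "__main__":
    best = perfect_score(QUERY)
    print(best)                              # compare with "Perfect score"
    print(query_match_percent(273.5, best))  # compare with a "Query Match" line
```

Under these assumptions a hit scoring 273.5 against a perfect score of 294 corresponds to the 93.0% shown on the result lines.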
ALIGNMENTS



RESULT 1 
INWHP 

insulin - sperm whale 

C;Species: Physeter catodon (sperm whale)
C;Date: 31-Mar-1992 #sequence_revision 31-Mar-1992 #text_change 16-Jul-1999
C;Accession: A93142; A90082
R;Ishihara, Y.; Saito, T.; Ito, Y.; Fujino, M.
Nature 181, 1468-1469, 1958
A;Title: Structure of sperm- and sei-whale insulins and their breakdown by whale
pepsin.
A;Reference number: A93142
A;Accession: A93142
A;Molecule type: protein
A;Residues: 1-30;31-51 <ISH>
R;Harris, J.I.; Sanger, F.; Naughton, M.A.
Arch. Biochem. Biophys. 65, 427-428, 1956
A;Title: Species differences in insulin.
A;Reference number: A90082
A;Accession: A90082
A;Molecule type: protein
A;Residues: 1-30;31-51 <HAR>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 93.0%; Score 273.5; DB 1; Length 51; 

Best Local Similarity 96.2%; Pred. No. 1.5e-24; 

Matches 50; Conservative 0; Mismatches 1; Indels 1; Gaps 1; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCTSICSLYQLENYCN 51
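
The Matches, Indels, Gaps and Best Local Similarity figures can be recomputed from the two aligned rows. The sketch below is a minimal check for RESULT 1, with the stray spaces in the Db row removed. It assumes that Best Local Similarity is identical columns divided by total alignment columns, which reproduces the printed percentages but is inferred rather than documented here; `alignment_stats` is a hypothetical helper name.

```python
# Aligned rows of RESULT 1 (sperm whale insulin), spaces from the scan removed.
QY_ALIGNED = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN"
DB_ALIGNED = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCTSICSLYQLENYCN"


def alignment_stats(qy, db):
    """Return (matches, indel columns, gap runs, percent identity over all columns)."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    matches = indels = gaps = 0
    in_gap = False
    for q, d in zip(qy, db):
        if q == "-" or d == "-":
            indels += 1          # every gapped column counts as one indel
            if not in_gap:
                gaps += 1        # a run of gapped columns counts as one gap
            in_gap = True
        else:
            in_gap = False
            if q == d:
                matches += 1
    similarity = round(100.0 * matches / len(qy), 1)
    return matches, indels, gaps, similarity


if __name__ == "__main__":
    print(alignment_stats(QY_ALIGNED, DB_ALIGNED))
```

Conservative substitutions are not counted here because doing so would need the full BLOSUM62 table; identical columns alone are enough to reproduce the similarity percentage under the stated assumption.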



RESULT 2 
INWHF 

insulin - finback whale (tentative sequence) 

C;Species: Balaenoptera physalus (finback whale, common rorqual)
C;Date: 31-Mar-1992 #sequence_revision 31-Mar-1992 #text_change 16-Jul-1999
C;Accession: A91918
R;Hama, H.; Titani, K.; Sakaki, S.; Narita, K.
J. Biochem. 56, 285-293, 1964
A;Title: The amino acid sequence in fin-whale insulin.
A;Reference number: A91918
A;Accession: A91918
A;Molecule type: protein
A;Residues: 1-30;31-51 <HAM>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 93.0%; Score 273.5; DB 1; Length 51; 

Best Local Similarity 96.2%; Pred. No. 1.5e-24; 

Matches 50; Conservative 0; Mismatches 1; Indels 1; Gaps 1; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCTSICSLYQLENYCN 51



RESULT 3 
INEL 

insulin - elephant 

C;Species: Elephantidae gen. sp. (elephant)
C;Date: 24-Apr-1984 #sequence_revision 30-Sep-1988 #text_change 16-Jul-1999
C;Accession: A01584
R;Smith, L.F.
Am. J. Med. 40, 662-666, 1966
A;Title: Species variation in the amino acid sequence of insulin.
A;Reference number: A90029; MUID:66160119; PMID:5949593
A;Accession: A01584
A;Molecule type: protein
A;Residues: 1-30;31-51 <SMI>
A;Note: the species of elephant is not given, but it is most probably the Indi
elephant (Elephas maximus)
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 93.0%; Score 273.5; DB 1; Length 51; 

Best Local Similarity 94.2%; Pred. No. 1.5e-24; 

Matches 49; Conservative 1; Mismatches 1; Indels 1; Gaps 1;

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
     |||||||||||||||||||||||||||||| |||||||| :|||||||||||
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTGVCSLYQLENYCN 51



RESULT 4 
PC7082 

epidermal growth factor/single chain insulin fusion protein - Bacillus brevis 
(fragment) 

C;Species: Bacillus brevis
C;Date: 18-Aug-2000 #sequence_revision 18-Aug-2000 #text_change 31-Mar-2003
C;Accession: PC7082; PC7083
R;Koh, M.; Hanagata, H.; Ebisu, S.; Morihara, K.; Takagi, H.
Biosci. Biotechnol. Biochem. 64, 1079-1081, 2000
A;Title: Use of Bacillus brevis for synthesis and secretion of Des-B30
single-chain human insulin precursor.
A;Reference number: PC7082; MUID:20335834; PMID:10879487
A;Accession: PC7082
A;Molecule type: DNA
A;Residues: 1-96 <KOH>
A;Accession: PC7083
A;Molecule type: protein
A;Residues: 19-28 <KO2>
C;Genetics:
A;Gene: egf-sci
C;Superfamily: insulin

Query Match 92.9%; Score 273; DB 2; Length 96; 

Best Local Similarity 96.2%; Pred. No. 2.9e-24; 

Matches 50; Conservative 0; Mismatches 0; Indels 2; Gaps 1;

Qy  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
      |||||||||||||||||||||||||||||  |||||||||||||||||||||
Db 47 FVNQHLCGSHLVEALYLVCGERGFFYTPK--GIVEQCCTSICSLYQLENYCN 96



RESULT 5 
INHY 

insulin - hamster 

C;Species: Cricetinae gen. sp. (hamster)
C;Date: 31-Mar-1992 #sequence_revision 31-Mar-1992 #text_change 16-Jul-1999
C;Accession: A91456
R;Neelon, F.A.; Delcher, H.K.; Steinman, H.; Lebovitz, H.E.
Fed. Proc. 32, 300, 1973
A;Title: Structure of hamster insulin: comparison with a tumor insulin.
A;Reference number: A91456
A;Accession: A91456
A;Molecule type: protein
A;Residues: 1-30;31-51 <NEE>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 92.3%; Score 271.5; DB 1; Length 51; 

Best Local Similarity 94.2%; Pred. No. 2.5e-24; 

Matches 49; Conservative 2; Mismatches 0; Indels 1; Gaps 1; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 

     |||||||||||||||||||||||||||||: |||:|||||||||||||||||
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKS-GIVDQCCTSICSLYQLENYCN 51 



RESULT 6 
INMSSP 

insulin - Egyptian spiny mouse (tentative sequence) 
C;Species: Acomys cahirinus (Egyptian spiny mouse)
C;Date: 13-Jul-1981 #sequence_revision 13-Jul-1981 #text_change 31-Mar-2000
C;Accession: A01591
R;Buenzli, H.F.; Humbel, R.E.
Hoppe-Seyler's Z. Physiol. Chem. 353, 444-450, 1972
A;Title: Isolation and partial structural analysis of insulin from mouse (Mus
musculus) and spiny mouse (Acomys cahirinus).
A;Reference number: A01591; MUID:72189454; PMID:5028210
A;Contents: composition
A;Accession: A01591
A;Molecule type: protein
A;Residues: 1-30;31-51 <BUE>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status predicted <BCH>
F;1-30,31-51/Product: insulin #status predicted <MAT>
F;31-51/Domain: insulin chain A #status predicted <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 91.3%; Score 268.5; DB 1; Length 51; 

Best Local Similarity 92.3%; Pred. No. 5.5e-24; 

Matches 48; Conservative 3; Mismatches 0; Indels 1; Gaps 1; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 

     ||:||||||||||||||||||||||||||: |||:|||||||||||||||||
Db 1 FVBQHLCGSHLVEALYLVCGERGFFYTPKS-GIVDQCCTSICSLYQLENYCN 51 



RESULT 7 
A59151 

insulin precursor - jack bean (fragments) 



N;Alternate names: hypoglycemic agent; plant insulin 
C;Species: Canavalia ensiformis (jack bean)
C;Date: 07-Dec-1999 #sequence_revision 07-Dec-1999 #text_change 10-Dec-1999
C;Accession: B59151; A59151
R;Oliveira, A.E.A.; Machado, O.L.T.; Gomes, V.M.; Xavier-Neto, J.; Pereira,
A.C.P.; Vieira, J.G.H.; Fernandes, K.V.S.; Xavier-Filho, J.
Protein Pept. Lett. 6, 15-21, 1999
A;Title: Jack bean seed coat contains a protein with complete sequence homology
to bovine insulin.
A;Reference number: A59151
A;Accession: B59151
A;Molecule type: protein
A;Residues: 1-30 <MACB>
A;Accession: A59151
A;Molecule type: protein
A;Residues: 31-51 <MACA>
C;Comment: The two chains are probably produced from the same precursor.
C;Superfamily: insulin
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;1-30/Domain: chain B #status experimental <CHB>
F;31-51/Domain: chain A #status experimental <CHA>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 91.0%; Score 267.5; DB 2; Length 51; 

Best Local Similarity 92.3%; Pred. No. 7.2e-24; 

Matches 48; Conservative 1; Mismatches 2; Indels 1; Gaps 1; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 

     |||||||||||||||||||||||||||||  ||||||| |:|||||||||||
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCASVCSLYQLENYCN 51 



RESULT 8 
IPHU 

insulin precursor [validated] - human 
N;Alternate names: preproinsulin 
C;Species: Homo sapiens (man)
C;Date: 23-Oct-1981 #sequence_revision 23-Oct-1981 #text_change 08-Dec-2000
C;Accession: A93222; A94253; A93216; A94251; A93144; A92075; A91186; I58114;
A01579; S58661
R;Bell, G.I.; Pictet, R.L.; Rutter, W.J.; Cordell, B.; Tischer, E.; Goodman,
H.M.
Nature 284, 26-32, 1980
A;Title: Sequence of the human insulin gene.
A;Reference number: A93222; MUID:80120725; PMID:6243748
A;Accession: A93222
A;Molecule type: DNA
A;Residues: 1-110 <BEL>
A;Cross-references: GB:J00265; NID:g186429; PIDN:AAA59172.1; PID:g386828
R;Ullrich, A.; Dull, T.J.; Gray, A.; Brosius, J.; Sures, I.
Science 209, 612-615, 1980
A;Title: Genetic variation in the human insulin gene.
A;Reference number: A94253; MUID:80236313; PMID:6248962
A;Accession: A94253
A;Molecule type: DNA
A;Residues: 1-110 <ULL>
A;Cross-references: GB:J00265; NID:g186429; PIDN:AAA59172.1; PID:g386828
R;Bell, G.I.; Swain, W.F.; Pictet, R.; Cordell, B.; Goodman, H.M.; Rutter, W.J.
Nature 282, 525-527, 1979
A;Title: Nucleotide sequence of a cDNA clone encoding human preproinsulin.
A;Reference number: A93216; MUID:80054779; PMID:503234
A;Accession: A93216
A;Molecule type: mRNA
A;Residues: 1-110 <BEL2>
A;Cross-references: GB:J00265; NID:g186429; PIDN:AAA59172.1; PID:g386828
R;Sures, I.; Goeddel, D.V.; Gray, A.; Ullrich, A.
Science 208, 57-59, 1980
A;Title: Nucleotide sequence of human preproinsulin complementary DNA.
A;Reference number: A94251; MUID:80147417; PMID:6927840
A;Accession: A94251
A;Molecule type: mRNA
A;Residues: 1-110 <SUR>
A;Cross-references: GB:J00265; NID:g186429; PIDN:AAA59172.1; PID:g386828
R;Nicol, D.S.H.W.; Smith, L.F.
Nature 187, 483-485, 1960
A;Title: Amino-acid sequence of human insulin.
A;Reference number: A93144
A;Accession: A93144
A;Molecule type: protein
A;Residues: 25-54;90-110 <NIC>
R;Oyer, P.E.; Cho, S.; Peterson, J.D.; Steiner, D.F.
J. Biol. Chem. 246, 1375-1386, 1971
A;Title: Studies on human proinsulin. Isolation and amino acid sequence of the
human pancreatic C-peptide.
A;Reference number: A92075; MUID:71116410; PMID:5101771
A;Accession: A92075
A;Molecule type: protein
A;Residues: 57-87 <OYE>
R;Ko, A.; Smyth, D.G.; Markussen, J.; Sundby, F.
Eur. J. Biochem. 20, 190-199, 1971
A;Title: Amino acid sequence of the C-peptide of human proinsulin.
A;Reference number: A91186; MUID:71257722; PMID:5560404
A;Accession: A91186
A;Molecule type: protein
A;Residues: 57-87 <KOA>
R;Lucassen, A.M.; Julier, C.; Beressi, J.P.; Boitard, C.; Froguel, P.; Lathrop,
M.; Bell, J.I.
Nature Genet. 4, 305-310, 1993
A;Title: Susceptibility to insulin dependent diabetes mellitus maps to a 4.1 kb
segment of DNA spanning the insulin gene and associated VNTR.
A;Reference number: I58114; MUID:93364428; PMID:8358440
A;Accession: I58114
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-59,63-110 <RES>
A;Cross-references: GB:L15440; NID:g307071; PIDN:AAA59179.1; PID:g307072
R;Sieber, P.; Kamber, B.; Hartmann, A.; Joehl, A.; Riniker, B.; Rittel, W.
Helv. Chim. Acta 57, 2617-2621, 1974
A;Title: Totalsynthese von Humaninsulin unter gezielter Bildung der
Disulfidbindungen.
A;Reference number: A91636; MUID:75077277; PMID:4443293
A;Contents: annotation; synthesis
A;Note: disulfide-bonded human insulin was synthesized; the synthetic hormone
was identical with the natural hormone in chemical and biological activities
A;Note: article in German with English abstract
R;Naithani, V.K.
Hoppe-Seyler's Z. Physiol. Chem. 354, 659-672, 1973
A;Title: The synthesis of C-peptide of human proinsulin.
A;Reference number: A91658; MUID:75040007; PMID:4803504
A;Contents: annotation; synthesis of residues 57-87
R;Geiger, R.; Jaeger, G.; Koenig, W.
Chem. Ber. 106, 2347-2352, 1973
A;Title: Synthesis of the complete sequence of human proinsulin C-peptide and
its [Glu-9, Gln-11] analogue.
A;Reference number: A90914
A;Contents: annotation; synthesis of residues 57-87
R;Kaufmann, J.E.; Irminger, J.C.; Halban, P.A.
Biochem. J. 310, 869-874, 1995
A;Title: Sequence requirements for proinsulin processing at the
B-chain/C-peptide junction.
A;Reference number: S58661; MUID:96013185; PMID:7575420
A;Contents: annotation; site-directed mutagenesis study of proteolytic
processing
C;Genetics:
A;Gene: GDB:INS
A;Cross-references: GDB:119349; OMIM:176730
A;Map position: 11p15.5-11p15.5
A;Introns: 63/1
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-24/Domain: signal sequence #status predicted <SIG>
F;25-54/Domain: insulin chain B #status experimental <BCH>
F;25-54,90-110/Product: insulin #status experimental <MAT>
F;57-87/Domain: connecting C peptide #status experimental <CPEP>
F;90-110/Domain: insulin chain A #status experimental <ACH>
F;31-96,43-109,95-100/Disulfide bonds: #status experimental

Query Match 90.8%; Score 267; DB 1; Length 110; 

Best Local Similarity 60.5%; Pred. No. 1.6e-23; 

Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1;

Qy  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
      ||||||||||||||||||||||||||||||
Db 25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEG 84

Qy 31 ----RGIVEQCCTSICSLYQLENYCN 52
          ||||||||||||||||||||||
Db 85 SLQKRGIVEQCCTSICSLYQLENYCN 110
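
The score of 267 for this alignment is consistent with a single 34-column gap charged as one opening penalty plus one extension penalty per gapped column. That cost formula is an inference from the report's numbers (Gapop 10.0, Gapext 0.5, perfect score 294), not something the report states, and the helper names below are hypothetical.

```python
GAP_OPEN = 10.0    # "Gapop" from the report header
GAP_EXTEND = 0.5   # "Gapext" from the report header


def gap_cost(length):
    """Assumed affine cost: one opening charge plus an extension charge per gapped column."""
    return GAP_OPEN + GAP_EXTEND * length


def score_with_single_gap(perfect, gap_length, lost_residue_score=0.0):
    """Perfect score minus one gap, minus any score lost at unaligned or substituted residues.

    lost_residue_score is the total drop relative to the self-match score
    at positions that are gapped out or replaced by a lower-scoring residue.
    """
    return perfect - lost_residue_score - gap_cost(gap_length)


if __name__ == "__main__":
    # All 52 query residues still match, so only the 34-column gap is charged.
    print(score_with_single_gap(294, 34))
    # A one-column gap with 10 points lost (one residue unaligned, one substituted
    # by a zero-scoring residue) matches the 273.5 hits earlier in the list.
    print(score_with_single_gap(294, 1, 10))
```

The same assumed formula gives 273 for a two-column gap that drops two residues worth 5 points each, in line with the fusion-protein hit above.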



RESULT 9 
B42179 

insulin precursor - green monkey 

C;Species: Cercopithecus aethiops (green monkey, grivet)
C;Date: 04-Mar-1993 #sequence_revision 18-Nov-1994 #text_change 16-Jul-1999
C;Accession: B42179; A05232; S16494; S22056
R;Seino, S.; Bell, G.I.; Li, W.H.
Mol. Biol. Evol. 9, 193-203, 1992
A;Title: Sequences of primate insulin genes support the hypothesis of a slower
rate of molecular evolution in humans and apes than in monkeys.
A;Reference number: A42179; MUID:92219953; PMID:1560757
A;Accession: B42179
A;Molecule type: DNA
A;Residues: 1-110 <SEI>
A;Cross-references: EMBL:X61092; NID:g22808; PIDN:CAA43405.1; PID:g22809
A;Note: sequence extracted from NCBI backbone (NCBIN:95185, NCBIP:95194)
R;Peterson, J.D.; Nehrlich, S.; Oyer, P.E.; Steiner, D.F.
J. Biol. Chem. 247, 4866-4871, 1972
A;Title: Determination of the amino acid sequence of the monkey, sheep, and dog
proinsulin C-peptides by a semi-micro Edman degradation procedure.
A;Reference number: A92111; MUID:72258016; PMID:4626369
A;Accession: A05232
A;Molecule type: protein
A;Residues: 57-87 <PET>
C;Genetics:
A;Introns: 63/1
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-24/Domain: signal sequence #status predicted <SIG>
F;25-54/Domain: insulin chain B #status predicted <BCH>
F;25-54,90-110/Product: insulin #status predicted <MAT>
F;57-87/Domain: connecting peptide #status experimental <CPEP>
F;90-110/Domain: insulin chain A #status predicted <ACH>
F;31-96,43-109,95-100/Disulfide bonds: #status predicted

Query Match 90.8%; Score 267; DB 2; Length 110; 

Best Local Similarity 60.5%; Pred. No. 1.6e-23; 

Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1; 

Qy  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
      ||||||||||||||||||||||||||||||
Db 25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDPQVGQVELGGGPGAGSLQPLALEG 84

Qy 31 ----RGIVEQCCTSICSLYQLENYCN 52
          ||||||||||||||||||||||
Db 85 SLQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 10 
A42179 

insulin precursor - chimpanzee 

C;Species: Pan troglodytes (chimpanzee)
C;Date: 04-Mar-1993 #sequence_revision 18-Nov-1994 #text_change 16-Jul-1999
C;Accession: A42179; S22058
R;Seino, S.; Bell, G.I.; Li, W.H.
Mol. Biol. Evol. 9, 193-203, 1992
A;Title: Sequences of primate insulin genes support the hypothesis of a slower
rate of molecular evolution in humans and apes than in monkeys.
A;Reference number: A42179; MUID:92219953; PMID:1560757
A;Accession: A42179
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-110 <SEI>
A;Cross-references: EMBL:X61089; NID:g38251; PIDN:CAA43403.1; PID:g38252
A;Note: sequence extracted from NCBI backbone (NCBIP:95067)
C;Genetics:
A;Introns: 63/1
C;Superfamily: insulin



Query Match 90.8%; Score 267; DB 2; Length 110; 

Best Local Similarity 60.5%; Pred. No. 1.6e-23; 

Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1; 

Qy  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
      ||||||||||||||||||||||||||||||
Db 25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEG 84

Qy 31 ----RGIVEQCCTSICSLYQLENYCN 52
          ||||||||||||||||||||||
Db 85 SLQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 11 
JQ0178 

insulin precursor - crab-eating macaque 

C;Species: Macaca fascicularis (crab-eating macaque)
C;Date: 07-Sep-1990 #sequence_revision 07-Sep-1990 #text_change 16-Jul-1999
C;Accession: JQ0178
R;Wetekam, W.; Groneberg, J.; Leineweber, M.; Wengenmayer, F.; Winnacker, E.L.
Gene 19, 179-183, 1982
A;Title: The nucleotide sequence of cDNA coding for preproinsulin from the
primate Macaca fascicularis.
A;Reference number: JQ0178; MUID:83080474; PMID:6184262
A;Accession: JQ0178
A;Molecule type: mRNA
A;Residues: 1-110 <WET>
A;Cross-references: GB:J00336; NID:g342121; PIDN:AAA36849.1; PID:g342122
C;Superfamily: insulin
F;1-24/Domain: signal sequence #status predicted <SIG>
F;25-54,90-110/Product: insulin #status predicted <MAT>
F;25-54/Domain: insulin chain B #status predicted <BCH>
F;55-89/Domain: insulin connecting C peptide #status predicted <CPT>
F;90-110/Domain: insulin chain A #status predicted <ACH>
F;31-96,43-109,95-100/Disulfide bonds: #status predicted

Query Match 90.8%; Score 267; DB 2; Length 110; 

Best Local Similarity 60.5%; Pred. No. 1.6e-23; 

Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1; 

Qy  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
      ||||||||||||||||||||||||||||||
Db 25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDPQVGQVELGGGPGAGSLQPLALEG 84

Qy 31 ----RGIVEQCCTSICSLYQLENYCN 52
          ||||||||||||||||||||||
Db 85 SLQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 12 
INWH1S 

insulin - sei whale 

C;Species: Balaenoptera borealis (sei whale)
C;Date: 13-Jul-1981 #sequence_revision 13-Jul-1981 #text_change 16-Jul-1999
C;Accession: A01582
R;Ishihara, Y.; Saito, T.; Ito, Y.; Fujino, M.
Nature 181, 1468-1469, 1958
A;Title: Structure of sperm- and sei-whale insulins and their breakdown by whale
pepsin.
A;Reference number: A93142
A;Accession: A01582
A;Molecule type: protein
A;Residues: 1-30;31-51 <ISH>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 89.6%; Score 263.5; DB 1; Length 51; 

Best Local Similarity 92.3%; Pred. No. 2.1e-23; 

Matches 48; Conservative 0; Mismatches 3; Indels 1; Gaps 1;

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
     |||||||||||||||||||||||||||||  ||||||| | |||||||||||
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCASTCSLYQLENYCN 51



RESULT 13 
INGT

insulin - goat 

C;Species: Capra aegagrus hircus (domestic goat)
C;Date: 13-Jul-1981 #sequence_revision 13-Jul-1981 #text_change 16-Jul-1999
C;Accession: A01586
R;Smith, L.F.
Am. J. Med. 40, 662-666, 1966
A;Title: Species variation in the amino acid sequence of insulin.
A;Reference number: A90029; MUID:66160119; PMID:5949593
A;Accession: A01586
A;Molecule type: protein
A;Residues: 1-30;31-51 <SMI>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 89.6%; Score 263.5; DB 1; Length 51; 

Best Local Similarity 90.4%; Pred. No. 2.1e-23; 

Matches 47; Conservative 1; Mismatches 3; Indels 1; Gaps 1;

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
     |||||||||||||||||||||||||||||  |||||||  :|||||||||||
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCAGVCSLYQLENYCN 51



RESULT 14 
INCMA 

insulin - Arabian camel (tentative sequence) 
C;Species: Camelus dromedarius (Arabian camel)
C;Date: 31-Mar-1992 #sequence_revision 31-Mar-1992 #text_change 16-Jul-1999
C;Accession: A92782
R;Danho, W.O.
J. Fac. Med. Baghdad 14, 16-28, 1972
A;Title: The isolation and characterization of insulin of camel (Camelus
dromedarius).
A;Reference number: A92782
A;Accession: A92782
A;Molecule type: protein
A;Residues: 1-30;31-51 <DAN>
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,31-51/Product: insulin #status experimental <MAT>
F;31-51/Domain: insulin chain A #status experimental <ACH>
F;7-37,19-50,36-41/Disulfide bonds: #status predicted

Query Match 89.6%; Score 263.5; DB 1; Length 51;

Best Local Similarity 90.4%; Pred. No. 2.1e-23; 

Matches 47; Conservative 1; Mismatches 3; Indels 1; Gaps 1; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 

     | |||||||||||||||||||||||||||  ||||||| |:|||||||||||
Db 1 FANQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCASVCSLYQLENYCN 51 



RESULT 15 
IPPG 

insulin precursor - pig 

C;Species: Sus scrofa domestica (domestic pig)
C;Date: 22-Jun-1981 #sequence_revision 22-Jun-1981 #text_change 16-Jul-1999
C;Accession: A01583; A94572; S16492; A60835; B60835
R;Chance, R.E.; Ellis, R.M.; Bromer, W.W.
Science 161, 165-167, 1968
A;Title: Porcine proinsulin: characterization and amino acid sequence.
A;Reference number: A94240; MUID:68286485; PMID:5657063
A;Accession: A01583
A;Molecule type: protein
A;Residues: 1-34,'Q',36-84 <CHA>
R;Chance, R.E.
submitted to the Atlas, July 1970
A;Reference number: A94572
A;Accession: A94572
A;Molecule type: protein
A;Residues: 1-84 <CH2>
R;Brown, H.; Sanger, F.; Kitai, R.
Biochem. J. 60, 556-565, 1955
A;Title: The structure of pig and sheep insulins.
A;Reference number: A90344
A;Accession: S16492
A;Molecule type: protein
A;Residues: 1-30;31-51 <BRO>
R;Snel, L.; Damgaard, U.
Horm. Metab. Res. 20, 476-480, 1988
A;Title: Proinsulin heterogeneity in pigs.
A;Reference number: A60835; MUID:89032178; PMID:3181865
A;Accession: A60835
A;Molecule type: protein
A;Residues: 33-38,40-62 <SNE>
A;Note: the authors report the characterization of a connecting peptide variant
lacking Ala-39
A;Accession: B60835
A;Molecule type: protein
A;Residues: 33-62 <SN2>
R;Blundell, T.; Dodson, G.; Hodgkin, D.; Mercola, D.
Adv. Protein Chem. 26, 279-402, 1972
A;Title: Insulin, the structure in the crystal and its reflection in chemistry
and biology.
A;Reference number: A90017
A;Contents: annotation; X-ray crystallography, 1.9 angstroms
C;Superfamily: insulin
C;Keywords: hormone; pancreas
F;1-30/Domain: insulin chain B #status experimental <BCH>
F;1-30,64-84/Product: insulin #status experimental <MAT>
F;33-63/Domain: connecting peptide #status experimental <CPEP>
F;64-84/Domain: insulin chain A #status experimental <ACH>
F;7-70,19-83,69-74/Disulfide bonds: #status experimental

Query Match 89.5%; Score 263; DB 1; Length 84; 

Best Local Similarity 60.7%; Pred. No. 3.6e-23; 

Matches 51; Conservative 0; Mismatches 1; Indels 32; Gaps 1; 

Qy  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
      |||||||||||||||||||||||||||||
Db  1 FVNQHLCGSHLVEALYLVCGERGFFYTPKARREAENPQAGAVELGGGLGGLQALALEGPP 60

Qy 31 --RGIVEQCCTSICSLYQLENYCN 52
        ||||||||||||||||||||||
Db 61 QKRGIVEQCCTSICSLYQLENYCN 84



Search completed: July 15, 2004, 16:37:33 
Job time : 5.98756 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: July 15, 2004, 16:37:41 ; Search time 21.6343 Seconds 

(without alignments) 
751.267 Million cell updates/sec 



Title: US-09-423-100-5 
Perfect score: 294 

Sequence: 1 FVNQHLCGSHLVEALYLVCG IVEQCCTSICSLYQLENYCN 52 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 1285345 seqs, 312560633 residues 

Total number of hits satisfying chosen parameters: 1285345 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : Published_Applications_AA:*
1: /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
7: /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
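
The throughput figure in the header can be reproduced from the other header values. The sketch below assumes that one "cell update" is one query residue compared with one database residue, so the rate is query length times residues searched divided by search time. This agrees with the printed 751.267 to within rounding, but the definition is an assumption and the function name is illustrative.

```python
def million_cell_updates_per_sec(query_length, db_residues, search_seconds):
    """Dynamic-programming cells filled per second, in millions (assumed definition)."""
    return query_length * db_residues / search_seconds / 1e6


if __name__ == "__main__":
    # Values from this run's header: 52-residue query, 312560633 residues, 21.6343 s.
    rate = million_cell_updates_per_sec(52, 312560633, 21.6343)
    print(f"{rate:.3f} Million cell updates/sec")
```

The same calculation with the PIR run's values (96191526 residues, 5.8209 seconds) lands within rounding of its reported 859.311.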



SUMMARIES 

                   %
Result         Query
   No.  Score  Match Length DB ID                 Description

     1    294  100.0     52 13 US-10-054-873-5    Sequence 5, Appli
     2    294  100.0    107 13 US-10-054-873-6    Sequence 6, Appli
     3    294  100.0    137 16 US-10-101-454-39   Sequence 39, Appl
     4    294  100.0    145 16 US-10-101-454-45   Sequence 45, Appl
     5    294  100.0    146 16 US-10-101-454-48   Sequence 48, Appl
     6    294  100.0    150 13 US-10-054-873-7    Sequence 7, Appli
     7  278.5   94.7     51 10 US-09-858-935B-5   Sequence 5, Appli
     8  278.5   94.7     51 12 US-10-444-649-3    Sequence 3, Appli
     9  278.5   94.7     51 12 US-10-444-701-3    Sequence 3, Appli
    10  278.5   94.7     51 12 US-10-271-869-5    Sequence 5, Appli
    11  278.5   94.7     51 13 US-10-028-410-3    Sequence 3, Appli
    12  278.5   94.7     51 14 US-10-444-326-3    Sequence 3, Appli
    13  278.5   94.7     51 16 US-10-444-262-3    Sequence 3, Appli
    14  275.5   93.7    104 16 US-10-101-454-15   Sequence 15, Appl
    15  275.5   93.7    124  9 US-09-894-711-18   Sequence 18, Appl
    16  275.5   93.7    138  9 US-09-861-687-19   Sequence 19, Appl
    17  275.5   93.7    138 12 US-10-620-651-19   Sequence 19, Appl
    18  275.5   93.7    140 16 US-10-101-454-33   Sequence 33, Appl
    19  275.5   93.7    140 16 US-10-101-454-42   Sequence 42, Appl
    20    273   92.9     50 13 US-10-066-009A-3   Sequence 3, Appli
    21  271.5   92.3    102 16 US-10-101-454-36   Sequence 36, Appl
    22    267   90.8     86  9 US-09-878-380-1    Sequence 1, Appli
    23    267   90.8     86 10 US-09-858-935B-4   Sequence 4, Appli
    24    267   90.8     86 12 US-10-444-649-2    Sequence 2, Appli
    25    267   90.8     86 12 US-10-444-701-2    Sequence 2, Appli
    26    267   90.8     86 12 US-10-271-869-4    Sequence 4, Appli
    27    267   90.8     86 13 US-10-028-410-2    Sequence 2, Appli
    28    267   90.8     86 13 US-10-054-873-4    Sequence 4, Appli
    29    267   90.8     86 14 US-10-444-326-2    Sequence 2, Appli
    30    267   90.8     86 16 US-10-444-262-2    Sequence 2, Appli
    31    267   90.8     96  9 US-09-947-563-4    Sequence 4, Appli
    32    267   90.8    110  9 US-09-205-658-125  Sequence 125, App
    33    267   90.8    110  9 US-09-815-229-3    Sequence 3, Appli
    34    267   90.8    110  9 US-09-804-409A-9   Sequence 9, Appli
    35    267   90.8    110 10 US-09-969-748C-6   Sequence 6, Appli
    36    267   90.8    110 10 US-09-963-693-125  Sequence 125, App
    37    267   90.8    110 12 US-10-411-037-44   Sequence 44, Appl
    38    267   90.8    110 12 US-10-411-026-44   Sequence 44, Appl
    39    267   90.8    110 14 US-10-038-686-1    Sequence 1, Appli
    40    267   90.8    110 14 US-10-328-813-2    Sequence 2, Appli
    41    267   90.8    110 15 US-10-383-285-2    Sequence 2, Appli
    42    267   90.8    110 15 US-10-346-563-2    Sequence 2, Appli
    43    267   90.8    110 15 US-10-321-717-2    Sequence 2, Appli
    44    267   90.8    110 16 US-10-410-962-44   Sequence 44, Appl
    45    267   90.8    110 16 US-10-411-049-44   Sequence 44, Appl



ALIGNMENTS 



RESULT 1 
US-10-054-873-5 

; Sequence 5, Application US/10054873
; Publication No. US20020164712A1
; GENERAL INFORMATION:
; APPLICANT: Gan, Zhong Ru
; TITLE OF INVENTION: Chimeric Protein Containing an
; Intramolecular Chaperone-Like Sequence
; NUMBER OF SEQUENCES: 7
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Townsend and Townsend and Crew LLP
; STREET: Two Embarcadero Center, Eighth Floor
; CITY: San Francisco
; STATE: California
; COUNTRY: USA
; ZIP: 94111-3834
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/054,873
; FILING DATE: 22-Jan-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: WO PCT/CN98/00052
; FILING DATE: 31-MAR-1998
; APPLICATION NUMBER: US 09/423,100
; FILING DATE: 11-DEC-2000
; ATTORNEY/AGENT INFORMATION:
; NAME: Mycroft, Frank J
; REGISTRATION NUMBER: 46,946
; REFERENCE/DOCKET NUMBER: 020167-000130US
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 52 amino acids
; TYPE: amino acid
; STRANDEDNESS: <Unknown>
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 5:

US-10-054-873-5 

Query Match 100.0%; Score 294; DB 13; Length 52; 

Best Local Similarity 100.0%; Pred. No. 5e-29; 

Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 

I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52 
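
The score of 294 for this exact 52-residue match equals the "Perfect score" given in the search header. The following is a minimal sketch of why, assuming the perfect score is simply the sum of the BLOSUM62 self-substitution (diagonal) values over the query. The helper name `perfect_score` is hypothetical and not part of GenCore.

```python
# BLOSUM62 diagonal (self-substitution) scores for the 20 standard residues.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Query sequence US-09-423-100-5 as shown in the Qy lines of the report.
QUERY = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN"


def perfect_score(sequence: str) -> int:
    """Score of a sequence aligned against itself (assumed definition)."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence)


print(len(QUERY), perfect_score(QUERY))
```

Under this assumption, any hit that contains the full query without substitutions or gaps reaches the same score, which is what Results 1 to 6 show.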



RESULT 2
US-10-054-873-6
; Sequence 6, Application US/10054873
; Publication No. US20020164712A1
; GENERAL INFORMATION:
; APPLICANT: Gan, Zhong Ru
; TITLE OF INVENTION: Chimeric Protein Containing an
; Intramolecular Chaperone-Like Sequence
; NUMBER OF SEQUENCES: 7
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Townsend and Townsend and Crew LLP
; STREET: Two Embarcadero Center, Eighth Floor
; CITY: San Francisco
; STATE: California
; COUNTRY: USA
; ZIP: 94111-3834
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/054,873
; FILING DATE: 22-Jan-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: WO PCT/CN98/00052
; FILING DATE: 31-MAR-1998
; APPLICATION NUMBER: US 09/423,100
; FILING DATE: 11-DEC-2000
; ATTORNEY/AGENT INFORMATION:
; NAME: Mycroft, Frank J
; REGISTRATION NUMBER: 46,946
; REFERENCE/DOCKET NUMBER: 020167-000130US
; INFORMATION FOR SEQ ID NO: 6:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 107 amino acids
; TYPE: amino acid
; STRANDEDNESS: <Unknown>
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 6:
US-10-054-873-6

  Query Match             100.0%;  Score 294;  DB 13;  Length 107;
  Best Local Similarity   100.0%;  Pred. No. 1.1e-28;
  Matches   52;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         56 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 107
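
Each alignment is preceded by three statistics lines made of `Name value;` pairs. As an illustration only, the sketch below pulls those pairs into a dictionary with a regular expression. The function name `parse_stats` and the pattern are assumptions made for this example; they are not part of GenCore, and the pattern expects cleanly extracted text rather than OCR-damaged lines.

```python
import re

# A field name (letters, dots, spaces), whitespace, a number, an optional '%',
# and the terminating semicolon.
STAT_PATTERN = re.compile(r"([A-Z][A-Za-z. ]*?)\s+([0-9][0-9.eE+-]*)%?;")


def parse_stats(block: str) -> dict:
    """Return {field name: numeric value} for one statistics block."""
    return {name: float(value) for name, value in STAT_PATTERN.findall(block)}


EXAMPLE = """\
Query Match 100.0%; Score 294; DB 13; Length 107;
Best Local Similarity 100.0%; Pred. No. 1.1e-28;
Matches 52; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
"""

stats = parse_stats(EXAMPLE)
print(stats["Score"], stats["Pred. No."], stats["Length"])
```

All values are returned as floats, so integer fields such as `Length` need an explicit `int(...)` if exact integer types matter downstream.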



RESULT 3
US-10-101-454-39
; Sequence 39, Application US/10101454
; Publication No. US20040110664A1
; GENERAL INFORMATION:
; APPLICANT: Havelund, Svend
; Halstrom, John
; Jonassen, Ib
; Andersen, Asser Sloth
; Markussen, Jan
; TITLE OF INVENTION: ACYLATED INSULIN
; NUMBER OF SEQUENCES: 49
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Novo Nordisk of North America, Inc.
; STREET: 405 Lexington Avenue, 64th Floor
; CITY: New York
; STATE: New York
; COUNTRY: United States of America
; ZIP: 10174-6401
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/101,454
; FILING DATE: 20-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/400,256
; FILING DATE: 03-MAR-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Lambiris, Elias J.
; REGISTRATION NUMBER: 33,728
; REFERENCE/DOCKET NUMBER: 3985.220-US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212-867-0123
; TELEFAX: 212-878-9655
; INFORMATION FOR SEQ ID NO: 39:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 137 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 39:
US-10-101-454-39

  Query Match             100.0%;  Score 294;  DB 16;  Length 137;
  Best Local Similarity   100.0%;  Pred. No. 1.4e-28;
  Matches   52;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         86 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 137



RESULT 4
US-10-101-454-45
; Sequence 45, Application US/10101454
; Publication No. US20040110664A1
; GENERAL INFORMATION:
; APPLICANT: Havelund, Svend
; Halstrom, John
; Jonassen, Ib
; Andersen, Asser Sloth
; Markussen, Jan
; TITLE OF INVENTION: ACYLATED INSULIN
; NUMBER OF SEQUENCES: 49
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Novo Nordisk of North America, Inc.
; STREET: 405 Lexington Avenue, 64th Floor
; CITY: New York
; STATE: New York
; COUNTRY: United States of America
; ZIP: 10174-6401
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/101,454
; FILING DATE: 20-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/400,256
; FILING DATE: 03-MAR-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Lambiris, Elias J.
; REGISTRATION NUMBER: 33,728
; REFERENCE/DOCKET NUMBER: 3985.220-US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212-867-0123
; TELEFAX: 212-878-9655
; INFORMATION FOR SEQ ID NO: 45:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 145 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 45:
US-10-101-454-45

  Query Match             100.0%;  Score 294;  DB 16;  Length 145;
  Best Local Similarity   100.0%;  Pred. No. 1.5e-28;
  Matches   52;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         94 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 145



RESULT 5
US-10-101-454-48
; Sequence 48, Application US/10101454
; Publication No. US20040110664A1
; GENERAL INFORMATION:
; APPLICANT: Havelund, Svend
; Halstrom, John
; Jonassen, Ib
; Andersen, Asser Sloth
; Markussen, Jan
; TITLE OF INVENTION: ACYLATED INSULIN
; NUMBER OF SEQUENCES: 49
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Novo Nordisk of North America, Inc.
; STREET: 405 Lexington Avenue, 64th Floor
; CITY: New York
; STATE: New York
; COUNTRY: United States of America
; ZIP: 10174-6401
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/101,454
; FILING DATE: 20-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/400,256
; FILING DATE: 03-MAR-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Lambiris, Elias J.
; REGISTRATION NUMBER: 33,728
; REFERENCE/DOCKET NUMBER: 3985.220-US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212-867-0123
; TELEFAX: 212-878-9655
; INFORMATION FOR SEQ ID NO: 48:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 146 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 48:
US-10-101-454-48

  Query Match             100.0%;  Score 294;  DB 16;  Length 146;
  Best Local Similarity   100.0%;  Pred. No. 1.5e-28;
  Matches   52;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         95 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 146



RESULT 6
US-10-054-873-7
; Sequence 7, Application US/10054873
; Publication No. US20020164712A1
; GENERAL INFORMATION:
; APPLICANT: Gan, Zhong Ru
; TITLE OF INVENTION: Chimeric Protein Containing an
; Intramolecular Chaperone-Like Sequence
; NUMBER OF SEQUENCES: 7
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Townsend and Townsend and Crew LLP
; STREET: Two Embarcadero Center, Eighth Floor
; CITY: San Francisco
; STATE: California
; COUNTRY: USA
; ZIP: 94111-3834
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/054,873
; FILING DATE: 22-Jan-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: WO PCT/CN98/00052
; FILING DATE: 31-MAR-1998
; APPLICATION NUMBER: US 09/423,100
; FILING DATE: 11-DEC-2000
; ATTORNEY/AGENT INFORMATION:
; NAME: Mycroft, Frank J
; REGISTRATION NUMBER: 46,946
; REFERENCE/DOCKET NUMBER: 020167-000130US
; INFORMATION FOR SEQ ID NO: 7:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 150 amino acids
; TYPE: amino acid
; STRANDEDNESS: <Unknown>
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 7:
US-10-054-873-7

  Query Match             100.0%;  Score 294;  DB 13;  Length 150;
  Best Local Similarity   100.0%;  Pred. No. 1.6e-28;
  Matches   52;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         99 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 150



RESULT 7
US-09-858-935B-5
; Sequence 5, Application US/09858935B
; Publication No. US20030069177A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Filvaroff, Ellen
; APPLICANT: Lowman, Henry B.
; TITLE OF INVENTION: METHOD FOR TREATING CARTILAGE DISORDERS
; FILE REFERENCE: P1794R1
; CURRENT APPLICATION NUMBER: US/09/858,935B
; CURRENT FILING DATE: 2002-07-02
; PRIOR APPLICATION NUMBER: US 60/248,985
; PRIOR FILING DATE: 2000-11-15
; PRIOR APPLICATION NUMBER: US 60/204,490
; PRIOR FILING DATE: 2000-05-16
; NUMBER OF SEQ ID NOS: 153
; SEQ ID NO 5
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-858-935B-5

  Query Match             94.7%;  Score 278.5;  DB 10;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51
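
The score of 278.5 and the Query Match of 94.7% for this hit can be reproduced from the header parameters (perfect score 294, Gapop 10.0, Gapext 0.5). The sketch below assumes an affine gap cost of `Gapop + n * Gapext` for a gap of `n` positions and assumes that Query Match is the score as a percentage of the perfect score. Both are inferences from the numbers printed in this report, not statements from GenCore documentation, and the function names are hypothetical.

```python
GAP_OPEN = 10.0        # "Gapop" in the search header
GAP_EXTEND = 0.5       # "Gapext" in the search header
PERFECT_SCORE = 294.0  # "Perfect score" in the search header


def gap_cost(length: int) -> float:
    """Assumed affine penalty for one gap spanning `length` positions."""
    return GAP_OPEN + GAP_EXTEND * length


def query_match_percent(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Assumed definition: score as a percentage of the perfect score."""
    return 100.0 * score / perfect


# The database sequence lacks the query's Arg31: the query loses that
# residue's BLOSUM62 self-score (5) and pays for one gap of length 1.
score = PERFECT_SCORE - 5 - gap_cost(1)
print(score, round(query_match_percent(score), 1))
```

The same assumptions account for hits with a long insertion in the database sequence: a single 34-position gap costs 27, which gives 294 - 27 = 267 and a Query Match of 90.8%, the values listed in the summary rows.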



RESULT 8
US-10-444-649-3
; Sequence 3, Application US/10444649
; Publication No. US20040033951A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Lowman, Henry
; TITLE OF INVENTION: PROTEIN VARIANTS
; FILE REFERENCE: P1712R1
; CURRENT APPLICATION NUMBER: US/10/444,649
; CURRENT FILING DATE: 2003-05-22
; PRIOR APPLICATION NUMBER: US/09/724,479
; PRIOR FILING DATE: 2000-11-28
; PRIOR APPLICATION NUMBER: US/09/477,923
; PRIOR FILING DATE: 2000-01-05
; NUMBER OF SEQ ID NOS: 6
; SEQ ID NO 3
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-444-649-3

  Query Match             94.7%;  Score 278.5;  DB 12;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51



RESULT 9
US-10-444-701-3
; Sequence 3, Application US/10444701
; Publication No. US20040033952A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Lowman, Henry
; TITLE OF INVENTION: PROTEIN VARIANTS
; FILE REFERENCE: P1712R1
; CURRENT APPLICATION NUMBER: US/10/444,701
; CURRENT FILING DATE: 2003-05-22
; PRIOR APPLICATION NUMBER: US/09/723,866
; PRIOR FILING DATE: 2000-11-28
; PRIOR APPLICATION NUMBER: US/09/477,923
; PRIOR FILING DATE: 2000-01-05
; NUMBER OF SEQ ID NOS: 6
; SEQ ID NO 3
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-444-701-3

  Query Match             94.7%;  Score 278.5;  DB 12;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51



RESULT 10
US-10-271-869-5
; Sequence 5, Application US/10271869
; Publication No. US20030211992A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Filvaroff, Ellen
; APPLICANT: Lowman, Henry B.
; TITLE OF INVENTION: METHOD FOR TREATING CARTILAGE DISORDERS
; FILE REFERENCE: P1794R1
; CURRENT APPLICATION NUMBER: US/10/271,869
; CURRENT FILING DATE: 2002-10-16
; PRIOR APPLICATION NUMBER: US/09/858,935
; PRIOR FILING DATE: 2002-07-02
; PRIOR APPLICATION NUMBER: US 60/248,985
; PRIOR FILING DATE: 2000-11-15
; PRIOR APPLICATION NUMBER: US 60/204,490
; PRIOR FILING DATE: 2000-05-16
; NUMBER OF SEQ ID NOS: 153
; SEQ ID NO 5
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-271-869-5

  Query Match             94.7%;  Score 278.5;  DB 12;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51



RESULT 11
US-10-028-410-3
; Sequence 3, Application US/10028410
; Publication No. US20020160955A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Lowman, Henry
; TITLE OF INVENTION: PROTEIN VARIANTS
; FILE REFERENCE: P1712R1-1
; CURRENT APPLICATION NUMBER: US/10/028,410
; CURRENT FILING DATE: 2001-12-19
; PRIOR APPLICATION NUMBER: US/09/477,924
; PRIOR FILING DATE: 2000-01-05
; NUMBER OF SEQ ID NOS: 6
; SEQ ID NO 3
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-028-410-3

  Query Match             94.7%;  Score 278.5;  DB 13;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51



RESULT 12
US-10-444-326-3
; Sequence 3, Application US/10444326
; Publication No. US20030191065A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Lowman, Henry
; TITLE OF INVENTION: PROTEIN VARIANTS
; FILE REFERENCE: P1712R1
; CURRENT APPLICATION NUMBER: US/10/444,326
; CURRENT FILING DATE: 2003-05-22
; PRIOR APPLICATION NUMBER: US/09/723,866
; PRIOR FILING DATE: 2000-11-28
; PRIOR APPLICATION NUMBER: US/09/477,923
; PRIOR FILING DATE: 2000-01-05
; NUMBER OF SEQ ID NOS: 6
; SEQ ID NO 3
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-444-326-3

  Query Match             94.7%;  Score 278.5;  DB 14;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51



RESULT 13
US-10-444-262-3
; Sequence 3, Application US/10444262
; Publication No. US20040023883A1
; GENERAL INFORMATION:
; APPLICANT: Dubaquie, Yves
; APPLICANT: Lowman, Henry
; TITLE OF INVENTION: PROTEIN VARIANTS
; FILE REFERENCE: P1712R1
; CURRENT APPLICATION NUMBER: US/10/444,262
; CURRENT FILING DATE: 2003-05-22
; PRIOR APPLICATION NUMBER: US/09/724,478
; PRIOR FILING DATE: 2000-11-28
; PRIOR APPLICATION NUMBER: US/09/477,923
; PRIOR FILING DATE: 2000-01-05
; NUMBER OF SEQ ID NOS: 6
; SEQ ID NO 3
; LENGTH: 51
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-444-262-3

  Query Match             94.7%;  Score 278.5;  DB 16;  Length 51;
  Best Local Similarity   98.1%;  Pred. No. 4.1e-27;
  Matches   51;  Conservative    0;  Mismatches    0;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||| |||||||||||||||||||||
Db          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTSICSLYQLENYCN 51



RESULT 14
US-10-101-454-15
; Sequence 15, Application US/10101454
; Publication No. US20040110664A1
; GENERAL INFORMATION:
; APPLICANT: Havelund, Svend
; Halstrom, John
; Jonassen, Ib
; Andersen, Asser Sloth
; Markussen, Jan
; TITLE OF INVENTION: ACYLATED INSULIN
; NUMBER OF SEQUENCES: 49
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Novo Nordisk of North America, Inc.
; STREET: 405 Lexington Avenue, 64th Floor
; CITY: New York
; STATE: New York
; COUNTRY: United States of America
; ZIP: 10174-6401
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/101,454
; FILING DATE: 20-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/400,256
; FILING DATE: 03-MAR-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Lambiris, Elias J.
; REGISTRATION NUMBER: 33,728
; REFERENCE/DOCKET NUMBER: 3985.220-US
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212-867-0123
; TELEFAX: 212-878-9655
; INFORMATION FOR SEQ ID NO: 15:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 104 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; SEQUENCE DESCRIPTION: SEQ ID NO: 15:
US-10-101-454-15

  Query Match             93.7%;  Score 275.5;  DB 16;  Length 104;
  Best Local Similarity   90.9%;  Pred. No. 2.1e-26;
  Matches   50;  Conservative    2;  Mismatches    0;  Indels    3;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT---RGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||:   :|||||||||||||||||||||
Db         50 FVNQHLCGSHLVEALYLVCGERGFFYTPKSDDAKGIVEQCCTSICSLYQLENYCN 104
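
The score of 275.5 for this hit combines two conservative substitutions with one three-position gap. As a rough check, the sketch below scores an already-aligned pair of strings column by column. It assumes standard BLOSUM62 values, the header's Gapop 10.0 and Gapext 0.5, and an affine cost of `Gapop + n * Gapext` per gap (an inference from the reported scores). Only the off-diagonal entries needed here (S/T = 1, K/R = 2, A/T = 0) are included, and `score_alignment` is a hypothetical helper, not a GenCore function.

```python
# BLOSUM62 diagonal values plus the few off-diagonal pairs used below.
DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}
OFF_DIAGONAL = {frozenset("ST"): 1, frozenset("KR"): 2, frozenset("AT"): 0}


def score_alignment(qy: str, db: str, gap_open=10.0, gap_extend=0.5) -> float:
    """Score two equal-length aligned strings; '-' marks a gap position."""
    total, in_gap = 0.0, False
    for q, d in zip(qy, db):
        if q == "-" or d == "-":
            # Opening a gap costs gap_open once; every position costs gap_extend.
            total -= gap_extend + (0.0 if in_gap else gap_open)
            in_gap = True
        else:
            in_gap = False
            total += DIAGONAL[q] if q == d else OFF_DIAGONAL[frozenset((q, d))]
    return total


QY = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT---RGIVEQCCTSICSLYQLENYCN"
DB = "FVNQHLCGSHLVEALYLVCGERGFFYTPKSDDAKGIVEQCCTSICSLYQLENYCN"
print(score_alignment(QY, DB))
```

Placing the gap between Thr30 and Arg31 pairs T with S and R with K, which is consistent with the reported "Conservative 2; Mismatches 0".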



RESULT 15
US-09-894-711-18
; Sequence 18, Application US/09894711
; Patent No. US20020137144A1
; GENERAL INFORMATION:
; APPLICANT: Kjeldsen, Thomas Borglum
; APPLICANT: Ludvigsen, Svend
; TITLE OF INVENTION: Method for making insulin precursors and
; TITLE OF INVENTION: insulin precursor analogues having improved fermentation
; TITLE OF INVENTION: yield in yeast
; FILE REFERENCE: 6148.400-US
; CURRENT APPLICATION NUMBER: US/09/894,711
; CURRENT FILING DATE: 2001-06-28
; PRIOR APPLICATION NUMBER: PA 2000 00443
; PRIOR FILING DATE: 2000-03-17
; PRIOR APPLICATION NUMBER: PA 1999 01869
; PRIOR FILING DATE: 1999-12-29
; PRIOR APPLICATION NUMBER: 60/211,081
; PRIOR FILING DATE: 2000-06-13
; PRIOR APPLICATION NUMBER: 60/181,450
; PRIOR FILING DATE: 2000-02-10
; PRIOR APPLICATION NUMBER: 09/740,359
; PRIOR FILING DATE: 2000-12-19
; NUMBER OF SEQ ID NOS: 20
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 18
; LENGTH: 124
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Synthetic
US-09-894-711-18

  Query Match             93.7%;  Score 275.5;  DB 9;  Length 124;
  Best Local Similarity   94.3%;  Pred. No. 2.5e-26;
  Matches   50;  Conservative    1;  Mismatches    1;  Indels    1;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPK-TRGIVEQCCTSICSLYQLENYCN 52
              |||||||||||||||||||||||||||||  :|||||||||||||||||||||
Db         72 FVNQHLCGSHLVEALYLVCGERGFFYTPKAAKGIVEQCCTSICSLYQLENYCN 124



Search completed: July 15, 2004, 17:05:08 
Job time : 21.6343 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on:         July 15, 2004, 16:29:50 ; Search time 17.7537 Seconds
                (without alignments)
                924.141 Million cell updates/sec

Title:          US-09-423-100-5
Perfect score:  294
Sequence:       1 FVNQHLCGSHLVEALYLVCG IVEQCCTSICSLYQLENYCN 52

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters:      1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
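
The throughput figure in this header can be cross-checked from the other header values. The sketch below assumes that a "cell update" is one cell of the dynamic-programming matrix, so that the total number of cells is the query length times the number of database residues searched. That definition is an assumption, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Query length x database residues, per second, in millions (assumed)."""
    return query_len * db_residues / seconds / 1e6


# Values from the header: 52-residue query, 315518202 residues searched,
# search time 17.7537 seconds.
rate = million_cell_updates_per_sec(52, 315_518_202, 17.7537)
print(f"{rate:.3f} million cell updates/sec")
```

The result agrees with the reported 924.141 to within the rounding of the printed search time.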



Database :       SPTREMBL_25:*
                 1:  sp_archea:*
                 2:  sp_bacteria:*
                 3:  sp_fungi:*
                 4:  sp_human:*
                 5:  sp_invertebrate:*
                 6:  sp_mammal:*
                 7:  sp_mhc:*
                 8:  sp_organelle:*
                 9:  sp_phage:*
                 10:  sp_plant:*
                 11:  sp_rodent:*
                 12:  sp_virus:*
                 13:  sp_vertebrate:*
                 14:  sp_unclassified:*
                 15:  sp_rvirus:*
                 16:  sp_bacteriap:*
                 17:  sp_archeap:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 



Result         Query
   No.  Score  Match  Length  DB  ID                   Description



ALIGNMENTS



RESULT 1
Q8HXV2
ID   Q8HXV2      PRELIMINARY;      PRT;   110 AA.
AC   Q8HXV2;
DT   01-MAR-2003 (TrEMBLrel. 23, Created)
DT   01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT   01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE   Insulin precursor.
GN   INS.
OS   Pongo pygmaeus (Orangutan).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pongo.
OX   NCBI_TaxID=9600;
RN   [1]
RP   SEQUENCE FROM N.A.
RA   Stead J.D.H., Jeffreys A.J.;
RT   "Haplotype diversity at the insulin region.";
RL   Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; AY137503; AAN06937.1; -.
DR   GO; GO:0005576; C:extracellular; IEA.
DR   GO; GO:0005179; F:hormone activity; IEA.
DR   GO; GO:0007582; P:physiological processes; IEA.
DR   InterPro; IPR004825; Ins/IGF/relax.
DR   Pfam; PF00049; Insulin; 1.
DR   PRINTS; PR00277; INSULINB.
DR   SMART; SM00078; IlGF; 1.
DR   PROSITE; PS00262; INSULIN; 1.
SQ   SEQUENCE   110 AA;  12038 MW;  22D2B32B94F520F8 CRC64;

  Query Match             90.8%;  Score 267;  DB 6;  Length 110;
  Best Local Similarity   60.5%;  Pred. No. 7.8e-29;
  Matches   52;  Conservative    0;  Mismatches    0;  Indels   34;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
              ||||||||||||||||||||||||||||||
Db         25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEG 84

Qy         31 ----RGIVEQCCTSICSLYQLENYCN 52
                  ||||||||||||||||||||||
Db         85 SLQKRGIVEQCCTSICSLYQLENYCN 110
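
Although every query residue is matched in this hit, Best Local Similarity is only 60.5%. The figures in this report are consistent with that percentage being the number of identical positions divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels together). The sketch below encodes that reading; it is an inference from the printed counts, and the function name is hypothetical.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns (assumed)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts taken from the "Matches ...; Indels ...;" lines of several results.
print(round(best_local_similarity(52, 0, 0, 34), 1))  # this result
print(round(best_local_similarity(48, 2, 2, 34), 1))  # next TrEMBL result
print(round(best_local_similarity(51, 0, 0, 1), 1))   # single-residue deletion
```

Under this reading the 34-position gap opposite the C-peptide region accounts for the entire drop from 100%, since 52 of 86 columns are identical.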



RESULT 2
Q8WNW6
ID   Q8WNW6      PRELIMINARY;      PRT;   110 AA.
AC   Q8WNW6;
DT   01-MAR-2002 (TrEMBLrel. 20, Created)
DT   01-MAR-2002 (TrEMBLrel. 20, Last sequence update)
DT   01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE   Preproinsulin.
OS   Felis silvestris catus (Cat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Carnivora; Fissipedia; Felidae; Felis.
OX   NCBI_TaxID=9685;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=Pancreas;
RA   Okamoto S., Morimatsu M.;
RT   "cat insulin.";
RL   Submitted (MAY-2000) to the EMBL/GenBank/DDBJ databases.
CC   -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).
CC   -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.
DR   EMBL; AB043535; BAB84110.1; -.
DR   GO; GO:0005576; C:extracellular; IEA.
DR   GO; GO:0005179; F:hormone activity; IEA.
DR   GO; GO:0007582; P:physiological processes; IEA.
DR   InterPro; IPR004825; Ins/IGF/relax.
DR   Pfam; PF00049; Insulin; 1.
DR   PRINTS; PR00277; INSULINB.
DR   SMART; SM00078; IlGF; 1.
DR   PROSITE; PS00262; INSULIN; 1.
SQ   SEQUENCE   110 AA;  12069 MW;  95FB6E170C7BECA4 CRC64;

  Query Match             85.4%;  Score 251;  DB 6;  Length 110;
  Best Local Similarity   55.8%;  Pred. No. 1.2e-26;
  Matches   48;  Conservative    2;  Mismatches    2;  Indels   34;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
              |||||||||||||||||||||||||||||
Db         25 FVNQHLCGSHLVEALYLVCGERGFFYTPKARREAEDLQGKDAELGEAPGAGGLQPSALEA 84

Qy         31 ----RGIVEQCCTSICSLYQLENYCN 52
                  |||||||| |:|||||||:|||
Db         85 PLQKRGIVEQCCASVCSLYQLEHYCN 110



RESULT 3
Q9I8Q7
ID   Q9I8Q7      PRELIMINARY;      PRT;   106 AA.
AC   Q9I8Q7;
DT   01-OCT-2000 (TrEMBLrel. 15, Created)
DT   01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT   01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE   Preproinsulin.
OS   Rana pipiens (Northern leopard frog).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Amphibia; Batrachia; Anura; Neobatrachia; Ranoidea; Ranidae; Rana.
OX   NCBI_TaxID=8404;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=20362507; PubMed=10818274;
RA   Irwin D.M., Sivarajah P.;
RT   "Proinsulin cDNAs from the leopard frog, Rana pipiens: evolution of
RT   proinsulin processing.";
RL   Comp. Biochem. Physiol. 125B:405-410(2000).
CC   -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).
CC   -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.
DR   EMBL; AF227187; AAF87285.1; -.
DR   HSSP; P01315; 1SDB.
DR   GO; GO:0005576; C:extracellular; IEA.
DR   GO; GO:0005179; F:hormone activity; IEA.
DR   GO; GO:0007582; P:physiological processes; IEA.
DR   InterPro; IPR004825; Ins/IGF/relax.
DR   Pfam; PF00049; Insulin; 1.
DR   PRINTS; PR00277; INSULINB.
DR   SMART; SM00078; IlGF; 1.
DR   PROSITE; PS00262; INSULIN; 1.
SQ   SEQUENCE   106 AA;  12183 MW;  3A870EEC70217F92 CRC64;

  Query Match             74.7%;  Score 219.5;  DB 13;  Length 106;
  Best Local Similarity   49.4%;  Pred. No. 2.5e-22;
  Matches   41;  Conservative    7;  Mismatches    4;  Indels   31;  Gaps    1;

Qy          1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTR----------------------------- 31
              | ||:|||||||||||:|||:|||||:|::|
Db         24 FDNQYLCGSHLVEALYMVCGDRGFFYSPRSRRDLEQPLVNGLQGSELDEMQVQSQAFQKR 83

Qy         32 --GIVEQCCTSICSLYQLENYCN 52
                ||||||| : |||| ||||||
Db         84 KPGIVEQCCHNTCSLYDLENYCN 106



RESULT 4
Q98TA8
ID   Q98TA8      PRELIMINARY;      PRT;   110 AA.
AC   Q98TA8;
DT   01-JUN-2001 (TrEMBLrel. 17, Created)
DT   01-JUN-2001 (TrEMBLrel. 17, Last sequence update)
DT   01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE   Preproinsulin.
OS   Pantodon buchholtzi (Butterflyfish).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Actinopterygii; Neopterygii; Teleostei; Osteoglossomorpha;
OC   Osteoglossiformes; Pantodontidae; Pantodon.
OX   NCBI_TaxID=8276;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=21203577; PubMed=11306171;
RA   Al-Mahrouki A.A., Irwin D.M., Graham L.C., Youson J.H.;
RT   "Molecular cloning of preproinsulin cDNAs from several
RT   osteoglossomorphs and a cyprinid.";
RL   Mol. Cell. Endocrinol. 174:51-58(2001).
CC   -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).
CC   -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.
DR   EMBL; AF199588; AAK28712.1;
DR   HSSP; P01308; 1HIS.
DR   GO; GO:0005576; C:extracellular; IEA.
DR   GO; GO:0005179; F:hormone activity; IEA.
DR   GO; GO:0007582; P:physiological processes; IEA.
DR   InterPro; IPR004825; Ins/IGF/relax.
DR   Pfam; PF00049; Insulin; 1.
DR   PRINTS; PR00277; INSULINB.
DR   SMART; SM00078; IlGF; 1.
DR   PROSITE; PS00262; INSULIN; 1.
SQ   SEQUENCE   110 AA;  12324 MW;  BDECCD659D872E06 CRC64;

  Query Match             68.5%;  Score 201.5;  DB 13;  Length 110;
  Best Local Similarity   43.5%;  Pred. No. 7.7e-20;
  Matches   37;  Conservative    8;  Mismatches    5;  Indels   35;  Gaps    1;

Qy          3 NQHLCGSHLVEALYLVCGERGFFYTPKT-------------------------------- 30
              :|||||||||:|||:||||:|||| |||
Db         26 SQHLCGSHLVDALYMVCGEKGFFYQPKTKRDVDPLLGFLSPKSAQENEADEYPYKDQGDL 85

Qy         31 ---RGIVEQCCTSICSLYQLENYCN 52
                 ||||||||   |::: |:||||
Db         86 KVKRGIVEQCCHHPCNIFDLQNYCN 110



RESULT 5
Q9DDE5
ID   Q9DDE5      PRELIMINARY;      PRT;   108 AA.
AC   Q9DDE5;
DT   01-MAR-2001 (TrEMBLrel. 16, Created)
DT   01-MAR-2001 (TrEMBLrel. 16, Last sequence update)
DT   01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE   Insulin precursor.
GN   INS.
OS   Brachydanio rerio (Zebrafish) (Danio rerio).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes;
OC   Cyprinidae; Danio.
OX   NCBI_TaxID=7955;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=[illegible]; PubMed=[illegible];
RA   Argenton F., Zecchin E., Bortolussi M.;
RT   "Early appearance of pancreatic hormone-expressing cells in the
RT   zebrafish embryo.";
RL   Mech. Dev. 87:217-221(1999).
CC   -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).
CC   -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.
DR   EMBL; AJ237750; CAC20109.1;
DR   HSSP; P01308; 1LPH.
DR   ZFIN; ZDB-GENE-980526-110; ins.
DR   GO; GO:0005576; C:extracellular; IEA.
DR   GO; GO:0005179; F:hormone activity; IEA.
DR   GO; GO:0007582; P:physiological processes; IEA.
DR   InterPro; IPR004825; Ins/IGF/relax.
DR   Pfam; PF00049; Insulin; 1.
DR   PRINTS; PR00277; INSULINB.
DR   SMART; SM00078; IlGF; 1.
DR   PROSITE; PS00262; INSULIN; 1.
KW   Signal.
FT   SIGNAL        1     23       POTENTIAL.
FT   CHAIN        24     53       INSULIN B CHAIN.
FT   CHAIN        86    108       INSULIN A CHAIN.
SQ   SEQUENCE   108 AA;  11904 MW;  3195289E72AD6D25 CRC64;

  Query Match             66.5%;  Score 195.5;  DB 13;  Length 108;
  Best Local Similarity   45.1%;  Pred. No. 5.1e-19;
  Matches   37;  Conservative    5;  Mismatches    7;  Indels   33;  Gaps    1;

Qy          4 QHLCGSHLVEALYLVCGERGFFYTPK---------------------------------T 30
              |||||||||:|||||||  |||| ||
Db         27 QHLCGSHLVDALYLVCGPTGFFYNPKRDVEPLLGFLPPKSAQETEVADFAFKDHAELIRK 86

Qy         31 RGIVEQCCTSICSLYQLENYCN 52
              ||||||||   ||:::|:||||
Db         87 RGIVEQCCHKPCSIFELQNYCN 108



RESULT 6
Q90ZN4
ID   Q90ZN4      PRELIMINARY;      PRT;   108 AA.
AC   Q90ZN4;
DT   01-DEC-2001 (TrEMBLrel. 19, Created)
DT   01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT   01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE   Preproinsulin.
OS   Catla catla (catla).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes;
OC   Cyprinidae; Catla.
OX   NCBI_TaxID=72446;
RN   [1]
RP   SEQUENCE FROM N.A.
RA   Bhattacharya S., Roy S.S., Dasgupta S., Ravikumar L., Mukherjee M.,
RA   Bandyopadhyaya I., Wakabayasi K.;
RT   "A new cell secreting insulin.";
RL   Submitted (APR-2001) to the EMBL/GenBank/DDBJ databases.
CC   -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).
CC   -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.
DR   EMBL; AF373021; AAK51558.1;
DR   HSSP; P01308; 1LNP.
DR   GO; GO:0005576; C:extracellular; IEA.
DR   GO; GO:0005179; F:hormone activity; IEA.
DR   GO; GO:0007582; P:physiological processes; IEA.
DR   InterPro; IPR004825; Ins/IGF/relax.
DR   Pfam; PF00049; Insulin; 1.
DR   PRINTS; PR00277; INSULINB.
DR   SMART; SM00078; IlGF; 1.
DR   PROSITE; PS00262; INSULIN; 1.
SQ   SEQUENCE   108 AA;  11881 MW;  D713026E22EF5D59 CRC64;

  Query Match             66.5%;  Score 195.5;  DB 13;  Length 108;
  Best Local Similarity   45.1%;  Pred. No. 5.1e-19;
  Matches   37;  Conservative    5;  Mismatches    7;  Indels   33;  Gaps    1;

Qy          4 QHLCGSHLVEALYLVCGERGFFYTPK---------------------------------T 30
              |||||||||:|||||||  |||| ||
Db         27 QHLCGSHLVDALYLVCGPTGFFYNPKRDVDPLMGFLPPKSAQETEVADFAFKDHAEVIRK 86

Qy         31 RGIVEQCCTSICSLYQLENYCN 52
              ||||||||   ||:::|:||||
Db         87 RGIVEQCCHKPCSIFELQNYCN 108

RESULT 7 
Q98TB0 

ID Q98TB0 PRELIMINARY; PRT; 111 AA. 

AC Q98TB0; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin (Fragment).

OS Chitala chitala (clown knifefish).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Osteoglossomorpha;

OC Osteoglossiformes; Notopteridae; Chitala.

OX NCBI_TaxID=112163; 

RN [1] 



RP SEQUENCE FROM N.A. 

RX MEDLINE=21203577; PubMed=11306171;

RA Al-Mahrouki A.A., Irwin D.M., Graham L.C., Youson J.H.;

RT "Molecular cloning of preproinsulin cDNAs from several

RT osteoglossomorphs and a cyprinid.";

RL Mol. Cell. Endocrinol. 174:51-58(2001).

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.

DR EMBL; AF199586; AAK28710.1; 

DR HSSP; P01308; 1LPH. 

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

FT NON_TER 111 111

SQ SEQUENCE 111 AA; 12483 MW; 247CA4431376329F CRC64;



Query Match 66.3%; Score 195; DB 13; Length 111;
Best Local Similarity 44.2%; Pred. No. 6.1e-19;
Matches 38; Conservative 3; Mismatches 9; Indels 36; Gaps 1;

Qy   3 NQHLCGSHLVEALYLVCGERGFFYTPK--------------------------------- 29
       |||||||||||||||||||||||| ||
Db  26 NQHLCGSHLVEALYLVCGERGFFYNPKMDKRDAEPLLGFLSPKSGLENEVDEYPFKDQGD 85

Qy  30 ---TRGIVEQCCTSICSLYQLENYCN 52
           ||||||||   |:::    |||
Db  86 VKMKRGIVEQCCHRPCNIFDQNQYCN 111



RESULT 8 
Q90ZY1 

ID Q90ZY1 PRELIMINARY; PRT; 110 AA. 

AC Q90ZY1; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin (Fragment).

OS Hiodon alosoides (goldeye).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Osteoglossomorpha;

OC Osteoglossiformes; Hiodontidae; Hiodon.

OX NCBI_TaxID=54904;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21203577; PubMed=11306171;

RA Al-Mahrouki A.A., Irwin D.M., Graham L.C., Youson J.H.;

RT "Molecular cloning of preproinsulin cDNAs from several 

RT osteoglossomorphs and a cyprinid."; 

RL Mol. Cell. Endocrinol. 174:51-58(2001). 

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY). 

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.

DR EMBL; AF282408; AAK54684.1; -.

DR HSSP; P01308; 1LNP.

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1.

FT NON_TER 110 110

SQ SEQUENCE 110 AA; 12343 MW; BDECCD7703E52E06 CRC64;

Query Match 65.8%; Score 193.5; DB 13; Length 110;
Best Local Similarity 42.4%; Pred. No. 9.7e-19;
Matches 36; Conservative 7; Mismatches 7; Indels 35; Gaps 1;

Qy   3 NQHLCGSHLVEALYLVCGERGFFYTPKT-------------------------------- 30
       :|||||||||:|||:||||:|||| |||
Db  26 SQHLCGSHLVDALYMVCGEKGFFYQPKTKRDVDPLLGFLSPKSAQENEADEYPYKDQGDL 85

Qy  31 ---RGIVEQCCTSICSLYQLENYCN 52
          ||||||||   |::: |  |||
Db  86 KVKRGIVEQCCHRPCNIFDLNQYCN 110
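
The two percentages printed with each hit can be reproduced from the other figures on the same summary lines. The sketch below is an inference from the numbers in this listing, not a statement of how GenCore computes them internally: it assumes Query Match is the hit score as a percentage of the perfect score (294 for this query), and Best Local Similarity is the number of identical columns as a percentage of all alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures taken from RESULT 8 (Q90ZY1) above.
print(query_match(193.5, 294))              # 65.8
print(best_local_similarity(36, 7, 7, 35))  # 42.4
```

Both values agree with the "Query Match 65.8%" and "Best Local Similarity 42.4%" reported for RESULT 8; the same arithmetic reproduces the figures for the other hits in this listing.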



RESULT 9 
Q98TA7 

ID Q98TA7 PRELIMINARY; PRT; 111 AA. 

AC Q98TA7; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin (Fragment).

OS Osteoglossum bicirrhosum (silver arawana).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Osteoglossomorpha;

OC Osteoglossiformes; Osteoglossidae; Osteoglossum.

OX NCBI_TaxID=109271; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21203577; PubMed=11306171;

RA Al-Mahrouki A.A., Irwin D.M., Graham L.C., Youson J.H.;

RT "Molecular cloning of preproinsulin cDNAs from several 

RT osteoglossomorphs and a cyprinid."; 

RL Mol. Cell. Endocrinol. 174:51-58(2001). 

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY). 

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY. 

DR EMBL; AF199589; AAK28713.1; 

DR HSSP; P01315; 1MPJ.

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1.

FT NON_TER 111 111

SQ SEQUENCE 111 AA; 12491 MW; AC9E19D2D4866D20 CRC64;



Query Match 65.1%; Score 191.5; DB 13; Length 111;
Best Local Similarity 41.2%; Pred. No. 1.8e-18;
Matches 35; Conservative 10; Mismatches 5; Indels 35; Gaps 1;

Qy   3 NQHLCGSHLVEALYLVCGERGFFYTPKT-------------------------------- 30
       :| |||||||:|||:|||:|||||:||:
Db  27 SQRLCGSHLVDALYMVCGDRGFFYSPKSRREAEPLLGFLSPKSGQENEVDEYPYKEQGEL 86

Qy  31 ---RGIVEQCCTSICSLYQLENYCN 52
          ||||||||   |::: |:||||
Db  87 KVKRGIVEQCCHRPCNIFDLQNYCN 111



RESULT 10 
Q98TA9 

ID Q98TA9 PRELIMINARY; PRT; 87 AA. 

AC Q98TA9; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin (Fragment).

OS Gnathonemus petersii.

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Osteoglossomorpha;

OC Osteoglossiformes; Mormyridae; Gnathonemus.

OX NCBI_TaxID=42645;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21203577; PubMed=11306171;

RA Al-Mahrouki A.A., Irwin D.M., Graham L.C., Youson J.H.;

RT "Molecular cloning of preproinsulin cDNAs from several 

RT osteoglossomorphs and a cyprinid."; 

RL Mol. Cell. Endocrinol. 174:51-58(2001). 

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY). 

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.

DR EMBL; AF199587; AAK28711.1; 

DR HSSP; P01308; IMS. 

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1. 

FT NON_TER 1 1 

FT NON_TER 87 87 

SQ SEQUENCE 87 AA; 9874 MW; FF448ED35D2453F5 CRC64; 

Query Match 63.8%; Score 187.5; DB 13; Length 87;
Best Local Similarity 42.9%; Pred. No. 5.1e-18;
Matches 36; Conservative 5; Mismatches 8; Indels 35; Gaps 1;

Qy   4 QHLCGSHLVEALYLVCGERGFFYTPKT--------------------------------- 30
       ||||||||||||:|||||||||: | |
Db   4 QHLCGSHLVEALFLVCGERGFFFNPDTKRDVDSLLGFLSPKSGPENEADEYRYKEQAEVK 63

Qy  31 --RGIVEQCCTSICSLYQLENYCN 52
         ||||||||   |::: |  |||
Db  64 VKRGIVEQCCHHPCNIFDLNQYCN 87



RESULT 11 
Q98TB1 

ID Q98TB1 PRELIMINARY; PRT; 108 AA. 

AC Q98TB1; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin (Fragment).

OS Catostomus commersoni (White sucker).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes;

OC Catostomidae; Catostomus. 

OX NCBI_TaxID=7971; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21203577; PubMed=11306171;

RA Al-Mahrouki A.A., Irwin D.M., Graham L.C., Youson J.H.;

RT "Molecular cloning of preproinsulin cDNAs from several 

RT osteoglossomorphs and a cyprinid."; 

RL Mol. Cell. Endocrinol. 174:51-58(2001). 

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY). 

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY. 

DR EMBL; AF199585; AAK28709.1; 

DR HSSP; P01308; 1LPH. 

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1.

FT NON_TER 108 108

SQ SEQUENCE 108 AA; 11873 MW; E426310696FBAFC8 CRC64;

Query Match 63.4%; Score 186.5; DB 13; Length 108;
Best Local Similarity 43.9%; Pred. No. 8.7e-18;
Matches 36; Conservative 4; Mismatches 9; Indels 33; Gaps 1;

Qy   4 QHLCGSHLVEALYLVCGERGFFYTPK---------------------------------T 30
       |||||||||:|||||||  |||| ||
Db  27 QHLCGSHLVDALYLVCGPTGFFYNPKRDVDPLIGFLPPKSGPENEVADFAFKDHAELIRK 86

Qy  31 RGIVEQCCTSICSLYQLENYCN 52
       ||||||||   |::: || |||
Db  87 RGIVEQCCHRPCNIFDLEKYCN 108



RESULT 12 
Q98TB2 



ID Q98TB2 PRELIMINARY; PRT; 91 AA. 

AC Q98TB2; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin (Fragment).

OS Ambloplites rupestris (Rock bass).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Euteleostei; Neoteleostei;

OC Acanthomorpha; Acanthopterygii; Percomorpha; Perciformes; Percoidei;

OC Centrarchidae; Ambloplites.

OX NCBI_TaxID=109273; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Al-Mahrouki A.A., Irwin D.M., Youson J.H.;

RT "Molecular cloning of preproinsulin cDNA from the rock bass.";

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.

DR EMBL; AF199584; AAK28708.1; -. 

DR HSSP; P01308; 1LPH. 

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1.

FT NON_TER 1 1

FT NON_TER 91 91

SQ SEQUENCE 91 AA; 10100 MW; E86C8B256DC69D39 CRC64;

Query Match 63.1%; Score 185.5; DB 13; Length 91;
Best Local Similarity 40.9%; Pred. No. 1e-17;
Matches 36; Conservative 5; Mismatches 8; Indels 39; Gaps 1;

Qy   4 QHLCGSHLVEALYLVCGERGFFYTPK---------------------------------- 29
       |||||||||:|||||||:||||| ||
Db   4 QHLCGSHLVDALYLVCGDRGFFYNPKRDVDPLMGFLPPKADGAAAPGGENEVAEFAFKDQ 63

Qy  30 -----TRGIVEQCCTSICSLYQLENYCN 52
             ||||||||   |::: |  |||
Db  64 MEMMVKRGIVEQCCHHPCNIFDLGRYCN 91



RESULT 13 
Q8HZ81 

ID Q8HZ81 PRELIMINARY; PRT; 65 AA. 

AC Q8HZ81; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Insulin (Fragment).

OS Gorilla gorilla (gorilla).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Gorilla. 



OX NCBI_TaxID=9593; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA O'hUigin C., Tichy H., Klein J.;

RT "Molecular evolution in higher primates; gene specific and organism

RT specific characteristics.";

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AY092023; AAM76640.1;

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR SMART; SM00078; IlGF; 1.

FT NON_TER 1 1

FT NON_TER 65 65

SQ SEQUENCE 65 AA; 6920 MW; B772017FD8BCABEA CRC64;

Query Match 49.7%; Score 146; DB 6; Length 65;
Best Local Similarity 47.7%; Pred. No. 1.9e-12;
Matches 31; Conservative 0; Mismatches 0; Indels 34; Gaps 1;

Qy   7 CGSHLVEALYLVCGERGFFYTPKT----------------------------------RG 32
       ||||||||||||||||||||||||                                  ||
Db   1 CGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRG 60

Qy  33 IVEQC 37
       |||||
Db  61 IVEQC 65



RESULT 14 
Q8HZ80 
ID Q8HZ80 PRELIMINARY; PRT; 65 AA.
AC Q8HZ80;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE Insulin (Fragment).
OS Pongo pygmaeus (Orangutan).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pongo.
OX NCBI_TaxID=9600;
RN [1]
RP SEQUENCE FROM N.A.
RA O'hUigin C., Tichy H., Klein J.;
RT "Molecular evolution in higher primates; gene specific and organism
RT specific characteristics.";
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY092024; AAM76641.1;
DR GO; GO:0005576; C:extracellular; IEA.
DR GO; GO:0005179; F:hormone activity; IEA.
DR GO; GO:0007582; P:physiological processes; IEA.
DR InterPro; IPR004825; Ins/IGF/relax.
DR Pfam; PF00049; Insulin; 1.
DR SMART; SM00078; IlGF; 1.
FT NON_TER 1 1
FT NON_TER 65 65
SQ SEQUENCE 65 AA; 6920 MW; B772017FD8BCABEA CRC64;



Query Match 49.7%; Score 146; DB 6; Length 65;
Best Local Similarity 47.7%; Pred. No. 1.9e-12;
Matches 31; Conservative 0; Mismatches 0; Indels 34; Gaps 1;

Qy   7 CGSHLVEALYLVCGERGFFYTPKT----------------------------------RG 32
       ||||||||||||||||||||||||                                  ||
Db   1 CGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRG 60

Qy  33 IVEQC 37
       |||||
Db  61 IVEQC 65



RESULT 15 
Q90XD0 

ID Q90XD0 PRELIMINARY; PRT; 207 AA.

AC Q90XD0; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Preproinsulin-like growth factor-II. 

GN IGF-II. 

OS Cyprinus carpio (Common carp).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes;

OC Cyprinidae; Cyprinus.

OX NCBI_TaxID=7962;

RN [1] 

RP SEQUENCE FROM N.A. 

RA Tse M.C.L., Chan K.M., Cheng C.H.K.;

RT "PCR-cloning and Gene Expression Studies on Common Carp (Cyprinus

RT carpio) Insulin-like Growth Factor-II.";

RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.

CC -!- SUBCELLULAR LOCATION: SECRETED (BY SIMILARITY).

CC -!- SIMILARITY: BELONGS TO THE INSULIN/IGF/RELAXIN FAMILY.

DR EMBL; AF402958; AAL25799.1; 

DR HSSP; P01308; 1LNP. 

DR GO; GO:0005576; C:extracellular; IEA.

DR GO; GO:0005179; F:hormone activity; IEA.

DR GO; GO:0007582; P:physiological processes; IEA.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1.

DR PRINTS; PR00277; INSULINB.

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1. 

SQ SEQUENCE 207 AA; 23869 MW; 44FBA2871A361862 CRC64; 

Query Match 49.0%; Score 144; DB 13; Length 207;
Best Local Similarity 48.2%; Pred. No. 1.2e-11;
Matches 27; Conservative 8; Mismatches 11; Indels 10; Gaps 1;

Qy   6 LCGSHLVEALYLVCGERGFFYTPKT----------RGIVEQCCTSICSLYQLENYC 51
       |||  ||:||  |||:|||:::  |          |||||:|| : |:|  || ||
Db  53 LCGGELVDALQFVCGDRGFYFSRPTSRLSSRRSQNRGIVEECCFNSCNLALLEQYC 108



Search completed: July 15, 2004, 16:41:00 
Job time : 21.9204 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on: July 15, 2004, 16:28:49 ; Search time 3.58955 Seconds
        (without alignments)
        754.314 Million cell updates/sec

Title: US-09-423-100-5
Perfect score: 294
Sequence: 1 FVNQHLCGSHLVEALYLVCG IVEQCCTSICSLYQLENYCN 52

Scoring table: BLOSUM62
               Gapop 10.0 , Gapext 0.5

Searched: 141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database : SwissProt_42:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
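
The "Perfect score: 294" in the header is consistent with the score the query would obtain when aligned to itself under BLOSUM62, that is, the sum of the matrix diagonal entries over its 52 residues. The sketch below checks that arithmetic. It is an illustration only: the diagonal values are the standard BLOSUM62 self-substitution scores, the full query string is taken from the alignments in this listing, and the names `BLOSUM62_DIAGONAL` and `self_score` are hypothetical.

```python
# Standard BLOSUM62 self-substitution (diagonal) scores.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Query US-09-423-100-5 as it appears on the Qy lines of the alignments.
QUERY = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN"


def self_score(sequence: str) -> int:
    """Score of a sequence aligned to itself with no gaps."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence)


print(len(QUERY))         # 52
print(self_score(QUERY))  # 294
```

Every "Query Match" percentage in the listing is then the hit score divided by this value.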
ALIGNMENTS



RESULT 1 
INS_BALPH 

ID INS_BALPH STANDARD; PRT; 51 AA. 

AC P01312; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 21-JUL-1986 (Rel. 01, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Insulin. 

GN INS.

OS Balaenoptera physalus (Finback whale) (Common rorqual), and

OS Physeter catodon (Sperm whale) (Physeter macrocephalus).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Cetacea; Mysticeti;

OC Balaenopteridae; Balaenoptera.

OX NCBI_TaxID=9770, 9755; 

RN [1] 

RP PARTIAL SEQUENCE.

RC SPECIES=B. physalus;

RA Hama H., Titani K., Sakaki S., Narita K.;

RT "The amino acid sequence in fin-whale insulin.";

RL J. Biochem. 56:285-293(1964).

RN [2] 

RP SEQUENCE. 

RC SPECIES=P. catodon; 



RA Ishihara Y., Saito T., Ito Y., Fujino M.;

RT "Structure of sperm- and sei-whale insulins and their breakdown by

RT whale pepsin.";

RL Nature 181:1468-1469(1958).

RN [3]

RP SEQUENCE.

RC SPECIES=P. catodon;

RA Harris J.I., Sanger F., Naughton M.A.;

RT "Species differences in insulin."; 

RL Arch. Biochem. Biophys. 65:427-438(1956). 

CC -!- FUNCTION: Insulin decreases blood glucose concentration. It 
CC increases cell permeability to monosaccharides, amino acids and 

CC fatty acids. It accelerates glycolysis, the pentose phosphate 

CC cycle, and glycogen synthesis in liver. 

CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two 
CC disulfide bonds. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- SIMILARITY: Belongs to the insulin family. 



DR PIR; A91918; INWHF.
DR PIR; A93142; INWHP.
DR HSSP; P01317; 1APH.
DR InterPro; IPR004825; Ins/IGF/relax.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism.
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN.
FT DISULFID 19 50 INTERCHAIN.
FT DISULFID 36 41
SQ SEQUENCE 51 AA; 5766 MW; 9007B514691A7CDD CRC64;



Query Match 93.0%; Score 273.5; DB 1; Length 51;
Best Local Similarity 96.2%; Pred. No. 1.9e-27;
Matches 50; Conservative 0; Mismatches 1; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
       |||||||||||||||||||||||||||||  |||||||||||||||||||||
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCTSICSLYQLENYCN 51



RESULT 2 
INS_ELEMA 

ID INS_ELEMA STANDARD; PRT; 51 AA. 

AC P01316; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 21-JUL-1986 (Rel. 01, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Insulin. 

GN INS.

OS Elephas maximus (Indian elephant).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Proboscidea; Elephantidae; Elephas.

OX NCBI_TaxID=9783;

RN [1] 



RP SEQUENCE.

RX MEDLINE=66160119; PubMed=5949593;

RA Smith L.F.;

RT "Species variation in the amino acid sequence of insulin.";

RL Am. J. Med. 40:662-666(1966).

CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and 

CC fatty acids. It accelerates glycolysis, the pentose phosphate 

CC cycle, and glycogen synthesis in liver. 

CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two 
CC disulfide bonds. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- MISCELLANEOUS: THE SPECIES OF ELEPHANT IS NOT GIVEN, BUT IT IS
CC MOST PROBABLY THE INDIAN ELEPHANT (ELEPHAS MAXIMUS).

CC -!- SIMILARITY: Belongs to the insulin family. 



DR HSSP; P01308; 1AI0.
DR InterPro; IPR004825; Ins/IGF/relax.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism.
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN.
FT DISULFID 19 50 INTERCHAIN.
FT DISULFID 36 41
SQ SEQUENCE 51 AA; 5752 MW; 9007B50CDB457D6D CRC64;



Query Match 93.0%; Score 273.5; DB 1; Length 51;
Best Local Similarity 94.2%; Pred. No. 1.9e-27;
Matches 49; Conservative 1; Mismatches 1; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
       |||||||||||||||||||||||||||||| |||||||| :|||||||||||
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT-GIVEQCCTGVCSLYQLENYCN 51



RESULT 3 
INS_ACOCA

ID INS_ACOCA STANDARD; PRT; 51 AA.
AC P01324;
DT 21-JUL-1986 (Rel. 01, Created)
DT 21-JUL-1986 (Rel. 01, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin.
GN INS.
OS Acomys cahirinus (Egyptian spiny mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Acomys.
OX NCBI_TaxID=10068;
RN [1]
RP COMPOSITION.
RX MEDLINE=72189454; PubMed=5028210;
RA Buenzli H.F., Humbel R.E.;
RT "Isolation and partial structural analysis of insulin from mouse (Mus
RT musculus) and spiny mouse (Acomys cahirinus).";
RL Hoppe-Seyler's Z. Physiol. Chem. 353:444-450(1972).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
DR PIR; A01591; INMSSP.
DR HSSP; P01308; 1TYM.
DR InterPro; IPR004825; Ins/IGF/relax.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism.
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN (BY SIMILARITY).
FT DISULFID 19 50 INTERCHAIN (BY SIMILARITY).
FT DISULFID 36 41 BY SIMILARITY.
SQ SEQUENCE 51 AA; 5768 MW; 992BD8B629047D3D CRC64;



Query Match 91.3%; Score 268.5; DB 1; Length 51;
Best Local Similarity 92.3%; Pred. No. 7.8e-27;
Matches 48; Conservative 3; Mismatches 0; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
Db   1 ^B^CGsiiiFJvLYivCGERGFFYTPKS-GIVDQCCTSICSLYQLENYCN 51



RESULT 4 

INS_CERAE

ID INS_CERAE STANDARD; PRT; 110 AA.
AC P30407; P01309;
DT 01-APR-1993 (Rel. 25, Created)
DT 01-APR-1993 (Rel. 25, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin precursor.
OS Cercopithecus aethiops (Green monkey).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae;
OC Cercopithecinae; Cercopithecus.
OX NCBI_TaxID=9534;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=92219953; PubMed=1560757;
RT "Sequences of primate insulin genes support the hypothesis of a
RT slower rate of molecular evolution in humans and apes than in
RT monkeys.";
RL Mol. Biol. Evol. 9:193-203(1992).
RN [2]
RP SEQUENCE OF 57-87.
RX MEDLINE=72258016; PubMed=4626369;
RA Peterson J.D., Nehrlich S., Oyer P.E., Steiner D.F.;
RT "Determination of the amino acid sequence of the monkey, sheep, and
RT dog proinsulin C-peptides by a semi-micro Edman degradation
RT procedure.";
RL J. Biol. Chem. 247:4866-4871(1972).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC --------------------------------------------------------------------------
DR EMBL; X61092; CAA43405.1; -.
DR PIR; B42179; B42179.
DR HSSP; P01308; 1AI0.
DR InterPro; IPR004825; Ins/IGF/relax.
DR Pfam; PF00049; Insulin; 1.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism; Signal.
FT SIGNAL 1 24
FT CHAIN 25 54 INSULIN B CHAIN.
FT PROPEP 57 87 C PEPTIDE.
FT CHAIN 90 110 INSULIN A CHAIN.
FT DISULFID 31 96 INTERCHAIN.
FT DISULFID 43 109 INTERCHAIN.
FT DISULFID 95 100
SQ SEQUENCE 110 AA; 12019 MW; 95A1F54BE7B247F9 CRC64;



Query Match 90.8%; Score 267; DB 1; Length 110;
Best Local Similarity 60.5%; Pred. No. 2.6e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       ||||||||||||||||||||||||||||||
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDPQVGQVELGGGPGAGSLQPLALEG 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 SLQKRGIVEQCCTSICSLYQLENYCN 110
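
The score of RESULT 4 can be checked by hand against the search parameters (BLOSUM62, Gapop 10.0, Gapext 0.5). All 52 query residues are matched identically, so the substitution total equals the perfect score of 294, and the alignment contains one gap of 34 positions. The sketch below assumes the affine convention that appears consistent with the scores in this listing, namely that a gap of length n costs Gapop + Gapext × n; this is inferred from the reported figures rather than taken from GenCore documentation, and the function names are hypothetical.

```python
def affine_gap_penalty(length: int, gap_open: float = 10.0,
                       gap_extend: float = 0.5) -> float:
    """Cost of one gap, assuming open + extend * length (inferred convention)."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def aligned_score(substitution_total: float, gap_lengths: list[int]) -> float:
    """Substitution total minus the penalty of every gap in the alignment."""
    return substitution_total - sum(affine_gap_penalty(n) for n in gap_lengths)


# RESULT 4 (INS_CERAE): 52 identities (294) and a single 34-residue gap.
print(aligned_score(294, [34]))  # 267.0
```

The same convention reproduces 273.5 for RESULT 1 (substitution total 284, one gap of length 1) and 146 for the two 65-residue primate fragments in the TrEMBL search (substitution total 173, one gap of length 34).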



RESULT 5 
INS_HUMAN 

ID INS_HUMAN STANDARD; PRT; 110 AA.
AC P01308;
DT 21-JUL-1986 (Rel. 01, Created)
DT 21-JUL-1986 (Rel. 01, Last sequence update)
DT 15-MAR-2004 (Rel. 43, Last annotation update)
DE Insulin precursor.
GN INS.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=80120725; PubMed=6243748;
RA Bell G.I., Pictet R.L., Rutter W.J., Cordell B., Tischer E.,
RA Goodman H.M.;
RT "Sequence of the human insulin gene.";
RL Nature 284:26-32(1980).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=80236313; PubMed=6248962;
RA Ullrich A., Dull T.J., Gray A., Brosius J., Sures I.;
RT "Genetic variation in the human insulin gene.";
RL Science 209:612-615(1980).
RN [3]
RP SEQUENCE FROM N.A.
RX MEDLINE=80054779; PubMed=503234;
RA Bell G.I., Swain W.F., Pictet R.L., Cordell B., Goodman H.M.,
RA Rutter W.J.;
RT "Nucleotide sequence of a cDNA clone encoding human preproinsulin.";
RL Nature 282:525-527(1979).
RN [4]
RP SEQUENCE FROM N.A.
RX MEDLINE=80147417; PubMed=6927840;
RA Sures I., Goeddel D.V., Gray A., Ullrich A.;
RT "Nucleotide sequence of human preproinsulin complementary DNA.";
RL Science 208:57-59(1980).
RN [5]
RP SEQUENCE FROM N.A.
RX MEDLINE=93364428; PubMed=8358440;
RA Lucassen A.M., Bell J.I., Julier C., Lathrop M.;
RT "Susceptibility to insulin dependent diabetes mellitus maps to a 4.1
RT kb segment of DNA spanning the insulin gene and associated VNTR.";
RL Nat. Genet. 4:305-310(1993).
RN [6]
RP SEQUENCE FROM N.A.
RC TISSUE=Pancreas;
RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length

RT human and mouse cDNA sequences.";

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).

RN [7] 

RP SEQUENCE OF 1-59 FROM N.A. 

RC TISSUE=Blood; 

RA Fajardy I., Weill J.J., Stuckens C.C., Danze P.M.P.;

RT "Description of a novel RFLP diallelic polymorphism (-127 Bsgl C/G) 

RT within the 5' region of insulin gene."; 

RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases. 

RN [8] 

RP SEQUENCE OF 25-54 AND 90-110. 

RA Nicol D.S.H.W., Smith L.F.; 

RT "Amino-acid sequence of human insulin."; 

RL Nature 187:483-485(1960).

RN [9]

RP SEQUENCE OF 57-87.

RX MEDLINE=71116410; PubMed=5101771;

RA Oyer P.E., Cho S., Peterson J.D., Steiner D.F.;

RT "Studies on human proinsulin. Isolation and amino acid sequence of

RT the human pancreatic C-peptide.";

RL J. Biol. Chem. 246:1375-1386(1971).
RN [10]

RP SEQUENCE OF 57-87.

RX MEDLINE=71257722; PubMed=5560404;

RA Ko A., Smyth D.G., Markussen J., Sundby F.;

RT "The amino acid sequence of the C-peptide of human proinsulin.";

RL Eur. J. Biochem. 20:190-199(1971).

RN [11]

RP SYNTHESIS.

RX MEDLINE=75077277; PubMed=4443293;

RA Sieber P., Kamber B., Hartmann A., Joehl A., Riniker B., Rittel W.;
RT "Total synthesis of human insulin under directed formation of the
RT disulfide bonds.";

RL Helv. Chim. Acta 57:2617-2621(1974).
RN [12]

RP SYNTHESIS OF 57-87.

RX MEDLINE=75040007; PubMed=4803504;

RA Naithani V.K.;

RT "Studies on polypeptides, IV. The synthesis of C-peptide of human
RT proinsulin.";

RL Hoppe-Seyler's Z. Physiol. Chem. 354:659-672(1973).
RN [13]

RP SYNTHESIS OF 65-69 AND 70-73.

RX MEDLINE=73161263; PubMed=4698555;

RA Geiger R., Volk A.;

RT "Synthesis of peptides with the properties of human proinsulin C
RT peptides (hC peptide). 3. Synthesis of the sequences 14-17 and 9-13
RT of human proinsulin C peptides.";
RL Chem. Ber. 106:199-205(1973).



RN [14] 

RP SYNTHESIS OF 84-87. 

RX MEDLINE=73161261; PubMed=4 698553 ; 

RA Geiger R. , Jaeger G. , Keonig W., Treuth G. ; 

RT "Synthesis of peptides with the properties of human proinsulin C 

RT peptides (hC peptide). I. Scheme for the synthesis and preparation of 

RT the sequence 28-31 of human proinsulin C peptide."; 

RL Chem. Ber. 106:188-192(1973).

RN [15] 

RP VARIANT LOS ANGELES SER-48. 

RX MEDLINE=84016053; PubMed=6312455;

RA Haneda M., Chan S.J., Kwok S.C.M., Rubenstein A.H., Steiner D.F.;

RT "Studies on mutant human insulin genes: identification and sequence

RT analysis of a gene encoding [SerB24] insulin.";

RL Proc. Natl. Acad. Sci. U.S.A. 80:6366-6370(1983).

RN [16] 

RP VARIANTS LOS ANGELES SER-48 AND CHICAGO LEU-49.

RX MEDLINE=84170233; PubMed=6424111;

RA Shoelson S., Fickova M., Haneda M., Nahum A., Musso G., Kaiser E.T.,

RA Rubenstein A.H., Tager H.;

RT "Identification of a mutant human insulin predicted to contain a

RT serine-for-phenylalanine substitution.";

RL Proc. Natl. Acad. Sci. U.S.A. 80:7390-7394(1983). 

RN [17] 

RP VARIANT PROVIDENCE ASP-34.

RX MEDLINE=87175640; PubMed=3470784;

RA Chan S.J., Seino S., Gruppuso P.A., Schwartz R., Steiner D.F.;

RT "A mutation in the B chain coding region is associated with impaired

RT proinsulin conversion in a family with hyperproinsulinemia.";

RL Proc. Natl. Acad. Sci. U.S.A. 84:2194-2197(1987). 

RN [18] 

RP VARIANT WAKAYAMA LEU-92.

RX MEDLINE=87058122; PubMed=3537011;

RA Sakura H., Iwamoto Y., Sakamoto Y., Kuzuya T., Hirata H.;

RT "Structurally abnormal insulin in a diabetic patient. Characterization

RT of the mutant insulin A3 (Val-->Leu) isolated from the pancreas.";

RL J. Clin. Invest. 78:1666-1672(1986). 

RN [19] 

RP VARIANT HIS-89. 

RX MEDLINE=90317021; PubMed=2196279;

RA Barbetti F., Raben N., Kadowaki T., Cama A., Accili D., Gabbay K.H.,

RA Merenich J.A., Taylor S.I., Roth J.;

RT "Two unrelated patients with familial hyperproinsulinemia due to a 

RT mutation substituting histidine for arginine at position 65 in the 

RT proinsulin molecule: identification of the mutation by direct 

RT sequencing of genomic deoxyribonucleic acid amplified by polymerase 

RT chain reaction."; 

RL J. Clin. Endocrinol. Metab. 71:164-169(1990).

RN [20] 

RP VARIANT HIS-89. 

RX MEDLINE=85261996; PubMed=4019786;

RA Shibasaki Y., Kawakami T., Kanazawa Y., Akanuma Y., Takaku F.;

RT "Posttranslational cleavage of proinsulin is blocked by a point 

RT mutation in familial hyperproinsulinemia."; 

RL J. Clin. Invest. 76:378-380(1985).

RN [21] 

RP VARIANT KYOTO LEU-89.

RX MEDLINE=92291307; PubMed=1601997;

RA Yano H., Kitano N., Morimoto M., Polonsky K.S., Imura H., Seino Y.;

RT "A novel point mutation in the human insulin gene giving rise to 

RT hyperproinsulinemia (proinsulin Kyoto)."; 

RL J. Clin. Invest. 89:1902-1907(1992). 

RN [22] 

RP STRUCTURE BY NMR. 

RX MEDLINE=91104966; PubMed=2271664;

RA Hua Q.-X., Weiss M.A.;

RT "Toward the solution structure of human insulin: sequential 2D 1H NMR 

RT assignment of a des-pentapeptide analogue and comparison with crystal 

RT structure."; 

RL Biochemistry 29:10545-10555(1990). 

RN [23] 

RP STRUCTURE BY NMR. 

RX MEDLINE=91242467; PubMed=2036420;

RA Hua Q.-X., Weiss M.A.;

RT "Comparative 2D NMR studies of human insulin and des-pentapeptide 

RT insulin: sequential resonance assignment and implications for protein 

RT dynamics and receptor recognition."; 

RL Biochemistry 30:5505-5515(1991). 

RN [24] 

RP STRUCTURE BY NMR. 

RX MEDLINE=91265527; PubMed=1646635;

RA Hua Q.-X., Weiss M.A.;

RT "Two-dimensional NMR studies of Des-(B26-B30)-insulin: sequence-

RT specific resonance assignments and effects of solvent composition.";

RL Biochim. Biophys. Acta 1078:101-110(1991). 

Query Match 90.8%; Score 267; DB 1; Length 110; 

Best Local Similarity 60.5%; Pred. No. 2.6e-26; 

Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1; 

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       ||||||||||||||||||||||||||||||
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEG 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 SLQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 6 
INS_MACFA 

ID INS_MACFA STANDARD; PRT; 110 AA. 

AC P30406; P01309; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 13-AUG-1987 (Rel. 05, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Insulin precursor. 

GN INS.

OS Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae;

OC Cercopithecinae; Macaca.

OX NCBI_TaxID=9541; 

RN [1] 



RP SEQUENCE FROM N.A. 

RX MEDLINE=83080474; PubMed=6184262;

RA Wetekam W., Groneberg J., Leineweber M., Wengenmayer F.,

RA Winnacker E.-L.;

RT "The nucleotide sequence of cDNA coding for preproinsulin from the

RT primate Macaca fascicularis.";

RL Gene 19:179-183(1982).

CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and 

CC fatty acids. It accelerates glycolysis, the pentose phosphate 

CC cycle, and glycogen synthesis in liver. 

CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- SIMILARITY: Belongs to the insulin family. 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; J00336; AAA36849.1; 

DR PIR; JQ0178; JQ0178. 

DR HSSP; P01308; 1AI0. 

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1. 

DR PRINTS; PR00277; INSULINB. 

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1. 

KW Insulin family; Hormone; Glucose metabolism; Signal. 
FT SIGNAL 1 24
FT CHAIN 25 54 INSULIN B CHAIN.
FT PROPEP 57 87 C PEPTIDE.
FT CHAIN 90 110 INSULIN A CHAIN.
FT DISULFID 31 96 INTERCHAIN.
FT DISULFID 43 109 INTERCHAIN.
FT DISULFID 95 100
SQ SEQUENCE 110 AA; 11991 MW; 83C6E33A80A420F9 CRC64;

Query Match 90.8%; Score 267; DB 1; Length 110;
Best Local Similarity 60.5%; Pred. No. 2.6e-26;
Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       ||||||||||||||||||||||||||||||
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDPQVGQVELGGGPGAGSLQPLALEG 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 SLQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 7 
INS_PANTR



ID INS_PANTR STANDARD; PRT; 110 AA. 

AC P30410; 

DT 01-APR-1993 (Rel. 25, Created) 

DT 01-APR-1993 (Rel. 25, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Insulin precursor. 

GN INS.

OS Pan troglodytes (Chimpanzee).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.

OX NCBI_TaxID=9598;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92219953; PubMed=1560757;

RA Seino S., Bell G.I., Li W.;

RT "Sequences of primate insulin genes support the hypothesis of a 

RT slower rate of molecular evolution in humans and apes than in 

RT monkeys."; 

RL Mol. Biol. Evol. 9:193-203(1992). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22833521; PubMed=12952878;

RA Stead J.D., Hurles M.E., Jeffreys A.J.;

RT "Global haplotype diversity in the human insulin gene region."; 

RL Genome Res. 13:2101-2111(2003). 

CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and 

CC fatty acids. It accelerates glycolysis, the pentose phosphate 

CC cycle, and glycogen synthesis in liver. 

CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- SIMILARITY: Belongs to the insulin family. 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X61089; CAA43403.1; 

DR EMBL; AY137497; AAN06933.1; -. 

DR PIR; A42179; A42179. 

DR PDB; 1EFE; 29-MAR-00.

DR InterPro; IPR004825; Ins/IGF/relax.

DR Pfam; PF00049; Insulin; 1. 

DR PRINTS; PR00277; INSULINB. 

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1. 

KW Insulin family; Hormone; Glucose metabolism; Signal; 3D-structure.

FT SIGNAL 1 24

FT CHAIN 25 54 INSULIN B CHAIN.

FT PROPEP 57 87 C PEPTIDE.

FT CHAIN 90 110 INSULIN A CHAIN. 

FT DISULFID 31 96 INTERCHAIN. 



FT DISULFID 43 109 INTERCHAIN. 

FT DISULFID 95 100 

SQ SEQUENCE 110 AA; 12025 MW; 41EB8DF79837CEF5 CRC64; 

Query Match 90.8%; Score 267; DB 1; Length 110; 

Best Local Similarity 60.5%; Pred. No. 2.6e-26; 

Matches 52; Conservative 0; Mismatches 0; Indels 34; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       ||||||||||||||||||||||||||||||
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEG 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 SLQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 8 
INS_BALBO 

ID INS_BALBO STANDARD; PRT; 51 AA. 

AC P01314; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 21-JUL-1986 (Rel. 01, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Insulin. 

GN INS.

OS Balaenoptera borealis (Sei whale).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Cetacea; Mysticeti;

OC Balaenopteridae; Balaenoptera.

OX NCBI_TaxID=9768;

RN [1] 

RP SEQUENCE. 

RA Ishihara Y., Saito T., Ito Y., Fujino M.;

RT "Structure of sperm- and sei-whale insulins and their breakdown by

RT whale pepsin.";

RL Nature 181:1468-1469(1958).

CC -!- FUNCTION: Insulin decreases blood glucose concentration. It

CC increases cell permeability to monosaccharides, amino acids and

CC fatty acids. It accelerates glycolysis, the pentose phosphate 

CC cycle, and glycogen synthesis in liver. 

CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two 

CC disulfide bonds. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- SIMILARITY: Belongs to the insulin family. 

DR PIR; A01582; INWH1S. 

DR HSSP; P01317; 1APH.

DR InterPro; IPR004825; Ins/IGF/relax.

DR PRINTS; PR00277; INSULINB. 

DR SMART; SM00078; IlGF; 1. 

DR PROSITE; PS00262; INSULIN; 1. 

KW Insulin family; Hormone; Glucose metabolism. 
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN.
FT DISULFID 19 50 INTERCHAIN.
FT DISULFID 36 41
SQ SEQUENCE 51 AA; 5723 MW; 9007B50E4O0A7DDD CRC64;

Query Match 89.6%; Score 263.5; DB 1; Length 51;
Best Local Similarity 92.3%; Pred. No. 3.3e-26;
Matches 48; Conservative 0; Mismatches 3; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
       |||||||||||||||||||||||||||||  ||||||| | |||||||||||
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCASTCSLYQLENYCN 51



RESULT 9 
INS_CAMDR
ID INS_CAMDR STANDARD; PRT; 51 AA.
AC P01320;
DT 21-JUL-1986 (Rel. 01, Created)
DT 21-JUL-1986 (Rel. 01, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin.
GN INS.
OS Camelus dromedarius (Dromedary) (Arabian camel).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Tylopoda; Camelidae; Camelus.
OX NCBI_TaxID=9838;
RN [1]
RP SEQUENCE.
RA Danho W.O.;
RT "The isolation and characterization of insulin of camel (Camelus
RT dromedarius).";
RL J. Fac. Med. Baghdad 14:16-28(1972).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
DR PIR; A92782; INCMA.
DR HSSP; P01317; 2INS.
DR InterPro; IPR004825; Ins/IGF/relax.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism.
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN.
FT DISULFID 19 50 INTERCHAIN.
FT DISULFID 36 41
SQ SEQUENCE 51 AA; 5693 MW; 901E88BA085A7DDD CRC64;

Query Match 89.6%; Score 263.5; DB 1; Length 51;
Best Local Similarity 90.4%; Pred. No. 3.3e-26;
Matches 47; Conservative 1; Mismatches 3; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
       | |||||||||||||||||||||||||||  ||||||| |:|||||||||||
Db   1 FANQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCASVCSLYQLENYCN 51



RESULT 10 
INS_CAPHI
ID INS_CAPHI STANDARD; PRT; 51 AA.
AC P01319;
DT 21-JUL-1986 (Rel. 01, Created)
DT 21-JUL-1986 (Rel. 01, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin.
GN INS.
OS Capra hircus (Goat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Caprinae; Capra.
OX NCBI_TaxID=9925;
RN [1]
RP SEQUENCE.
RX MEDLINE=66160119; PubMed=5949593;
RA Smith L.F.;
RT "Species variation in the amino acid sequence of insulin.";
RL Am. J. Med. 40:662-666(1966).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
DR PIR; A01586; INGT.
DR HSSP; P01317; 1APH.
DR InterPro; IPR004825; Ins/IGF/relax.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism.
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN.
FT DISULFID 19 50 INTERCHAIN.
FT DISULFID 36 41
SQ SEQUENCE 51 AA; 5692 MW; 9007B50CDB4E7DDD CRC64;

Query Match 89.6%; Score 263.5; DB 1; Length 51;
Best Local Similarity 90.4%; Pred. No. 3.3e-26;
Matches 47; Conservative 1; Mismatches 3; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
       |||||||||||||||||||||||||||||  |||||||  :|||||||||||
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCAGVCSLYQLENYCN 51



RESULT 11 
INS_PIG 

ID INS_PIG STANDARD; PRT; 108 AA. 

AC P01315; Q9TSJ5; 

DT 21-JUL-1986 (Rel. 01, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Insulin precursor. 

GN INS.

OS Sus scrofa (Pig).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus.

OX NCBI_TaxID=9823; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Han X.G., Tuch B.E.;

RT "Complete porcine preproinsulin cDNA sequence.";

RL Submitted (MAY-1998) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Large white; 

RX MEDLINE=22135958; PubMed=12140686;

RA Amarger V., Nguyen M., Laere A.S., Braunschweig M., Nezer C.,

RA Georges M., Andersson L.;

RT "Comparative sequence analysis of the INS-IGF2-H19 gene cluster in

RT pigs.";

RL Mamm. Genome 13:388-398(2002).

RN [3] 

RP SEQUENCE OF 25-108. 

RX MEDLINE=68286485; PubMed=5657063;

RA Chance R.E., Ellis R.M., Bromer W.W.;

RT "Porcine proinsulin: characterization and amino acid sequence.";

RL Science 161:165-167(1968). 
RN [4] 

RP REVISION TO 59. 

RA Chance R.E.;

RL Submitted (JUL-1970) to the PIR data bank. 
RN [5] 

RP X-RAY CRYSTALLOGRAPHY (1.9 ANGSTROMS). 

RA Blundell T.L., Dodson G.G., Hodgkin D., Mercola D.; 

RT "Insulin. The structure in the crystal and its reflection in

RT chemistry and biology."; 

RL Adv. Protein Chem. 26:279-402(1972). 

RN [6] 

RP X-RAY CRYSTALLOGRAPHY (1.5 ANGSTROMS). 
RA Isaacs N.W., Agarwal R.C.; 

RT "Experience with fast Fourier least squares in the refinement of the 
RT crystal structure of rhombohedral 2-zinc insulin at 1.5-A 
RT resolution.";

RL Acta Crystallogr. A 34:782-791(1978).
RN [7]

RP X-RAY CRYSTALLOGRAPHY (1.5 ANGSTROMS).
RX MEDLINE=89099318; PubMed=2905485;

RA Baker E.N., Blundell T.L., Cutfield J.F., Cutfield S.M., Dodson E.J.,
RA Dodson G.G., Crowfoot Hodgkin D.M., Hubbard R.E., Isaacs N.W.,
RA Reynolds C.D., Sakabe K., Sakabe N., Vijayan N.M.;

RT "The structure of 2Zn pig insulin crystals at 1.5-A resolution.";
RL Philos. Trans. R. Soc. Lond., B, Biol. Sci. 319:369-456(1988).
RN [8] 

RP X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS). 
RX MEDLINE=92126280; PubMed=1772633;

RA Balschmidt P., Hansen F.B., Dodson E., Dodson G., Korber F.;
RT "Structure of porcine insulin cocrystallized with clupeine Z."; 
RL Acta Crystallogr. B 47:975-986(1991). 
RN [9] 

RP X-RAY CRYSTALLOGRAPHY. 

RX MEDLINE=91222450; PubMed=2025410;

RA Badger J., Harris M.R., Reynolds C.D., Evans A.C., Dodson E.J.,
RA Dodson G.G., North A.C.T.;

RT "Structure of the pig insulin dimer in the cubic crystal.";
RL Acta Crystallogr. B 47:127-136(1991). 
RN [10] 

RP X-RAY CRYSTALLOGRAPHY (1.65 ANGSTROMS). 

RA Diao J.-S., Wan Z.-L., Chang W.-R., Liang D.-C.;
RT "Structure of monomeric porcine DesB1-B2 despentapeptide (B26-B30)
RT insulin at 1.65-A resolution.";
RL Acta Crystallogr. D 53:507-512(1997).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
CC -!- DATABASE: NAME=Protein Spotlight;
CC NOTE=Issue 9 of April 2001;
CC WWW="http://www.expasy.org/spotlight/articles/sptlt009.html".



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; AF064555; AAC77920.1; ALT_INIT.
DR EMBL; AY044828; AAL69550.1; -.
DR PDB; 3INS; 09-JAN-89.
DR PDB; 4INS; 31-JUL-94.
DR PDB; 6INS; 31-JAN-94.
DR PDB; 7INS; 31-JAN-94.
DR PDB; 9INS; 15-OCT-91.
DR PDB; 1IZA; 15-OCT-91.
DR PDB; 1IZB; 15-OCT-91.
DR PDB; 2TCI; 29-JAN-96.
DR PDB; 1MPJ; 29-JAN-96.
DR PDB; 3MTH; 29-JAN-96.
DR PDB; 1DEI; 16-JUN-97.
DR PDB; 1SDB; 01-APR-98.
DR PDB; 1WAV; 28-FEB-97.
DR PDB; 1ZEI; 16-FEB-99.
DR PDB; 1ZNI; 28-JAN-98.
DR InterPro; IPR004825; Ins/IGF/relax.
DR Pfam; PF00049; Insulin; 1.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism; 3D-structure.
FT SIGNAL 1 24
FT CHAIN 25 54 INSULIN B CHAIN.
FT PROPEP 57 85 C PEPTIDE.
FT CHAIN 88 108 INSULIN A CHAIN.
FT DISULFID 31 94 INTERCHAIN.
FT DISULFID 43 107 INTERCHAIN.
FT DISULFID 93 98
FT HELIX 26 46
FT STRAND 48 48
FT HELIX 89 94
FT HELIX 100 106
FT STRAND 107 107
SQ SEQUENCE 108 AA; 11671 MW; CB4491B429858

Query Match 89.5%; Score 263; DB 1; Length 108;
Best Local Similarity 60.7%; Pred. No. 7.9e-26;
Matches 51; Conservative 0; Mismatches 1; Indels 32; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       |||||||||||||||||||||||||||||
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKARREAENPQAGAVELGGGLGGLQALALEGPP 84

Qy  31 --RGIVEQCCTSICSLYQLENYCN 52
         ||||||||||||||||||||||
Db  85 QKRGIVEQCCTSICSLYQLENYCN 108



RESULT 12
INS_RABIT
ID INS_RABIT STANDARD; PRT; 110 AA.
AC P01311;
DT 21-JUL-1986 (Rel. 01, Created)
DT 01-FEB-1996 (Rel. 33, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin precursor.
GN INS.
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
OX NCBI_TaxID=9986;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=New Zealand white; TISSUE=Pancreas;
RX MEDLINE=94179230; PubMed=8132571;
RA Devaskar S.U., Giddings S.J., Rajakumar P.A., Carnaghi L.R.,
RA Menon R.K., Zahm D.S.;
RT "Insulin gene expression and insulin synthesis in mammalian neuronal
RT cells.";
RL J. Biol. Chem. 269:8445-8454(1994).
RN [2] 



RP SEQUENCE OF 25-54 AND 90-110. 

RX MEDLINE=66160119; PubMed=5949593;

RA Smith L.F.;

RT "Species variation in the amino acid sequence of insulin."; 

RL Am. J. Med. 40:662-666(1966). 

RN [3] 

RP SEQUENCE OF 56-110 FROM N.A.

RA Giddings S.J., Carnaghi L.R., Devaskar S.U.;

RL Submitted (APR-1991) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Insulin decreases blood glucose concentration. It

CC increases cell permeability to monosaccharides, amino acids and 

CC fatty acids. It accelerates glycolysis, the pentose phosphate 

CC cycle, and glycogen synthesis in liver. 

CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two 
CC disulfide bonds. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- SIMILARITY: Belongs to the insulin family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U03610; AAA19033.1; 

DR EMBL; M61153; AAA17540.1; -. 

DR PIR; A53438; INRB. 

DR HSSP; P01308; 1TYM. 

DR InterPro; IPR004825; Ins/IGF/relax . 

DR Pfam; PF00049; Insulin; 1. 

DR PRINTS; PR00277; INSULINB. 

DR SMART; SM00078; IlGF; 1.

DR PROSITE; PS00262; INSULIN; 1. 

KW Insulin family; Hormone; Glucose metabolism; Signal.
FT SIGNAL 1 24
FT CHAIN 25 54 INSULIN B CHAIN.
FT PROPEP 57 87 C PEPTIDE.
FT CHAIN 90 110 INSULIN A CHAIN.
FT DISULFID 31 96 INTERCHAIN.
FT DISULFID 43 109 INTERCHAIN.
FT DISULFID 95 100
FT CONFLICT 83 83 E -> Y (IN REF. 3).
SQ SEQUENCE 110 AA; 1183... MW; 82D2975B85D77FA8 CRC64;

Query Match 89.5%; Score 263; DB 1; Length 110;
Best Local Similarity 59.3%; Pred. No. 8e-26;
Matches 51; Conservative 1; Mismatches 0; Indels 34; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       |||||||||||||||||||||||||||||:
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKSRREVEELQVGQAELGGGPGAGGLQPSALEL 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 ALQKRGIVEQCCTSICSLYQLENYCN 110

RESULT 13 
INS_SPETR
ID INS_SPETR STANDARD; PRT; 110 AA.
AC Q91XI3;
DT 10-OCT-2003 (Rel. 42, Created)
DT 10-OCT-2003 (Rel. 42, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin precursor.
GN INS.
OS Spermophilus tridecemlineatus (Thirteen-lined ground squirrel).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Sciuridae; Sciurinae;
OC Spermophilus.
OX NCBI_TaxID=43179;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Pancreas;
RA Tredrea M.M., Buck M.J., Guhaniyogi J., Squire T.L., Andrews M.T.;
RT "Regulation of PDK4 expression in a hibernating mammal.";
RL Submitted (JUN-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; AY038604; AAK72558.1;
DR HSSP; P01308; 1LNP.
DR InterPro; IPR004825; Ins/IGF/relax.
DR Pfam; PF00049; Insulin; 1.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism; Signal.
FT SIGNAL 1 24 BY SIMILARITY.
FT CHAIN 25 54 INSULIN B CHAIN.
FT PROPEP 57 87 C PEPTIDE.
FT CHAIN 90 110 INSULIN A CHAIN.
FT DISULFID 31 96 INTERCHAIN (BY SIMILARITY).
FT DISULFID 43 109 INTERCHAIN (BY SIMILARITY).
FT DISULFID 95 100 BY SIMILARITY.
SQ SEQUENCE 110 AA; 12004 MW; 4511768D6622BEE5 CRC64;

Query Match 89.5%; Score 263; DB 1; Length 110;
Best Local Similarity 59.3%; Pred. No. 8e-26;
Matches 51; Conservative 1; Mismatches 0; Indels 34; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       |||||||||||||||||||||||||||||:
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKSRREVEEQQGGQVELGGGPGAGLPQPLALEM 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 ALQKRGIVEQCCTSICSLYQLENYCN 110



RESULT 14 
INS_FELCA
ID INS_FELCA STANDARD; PRT; 51 AA.
AC P06306;
DT 01-JAN-1988 (Rel. 06, Created)
DT 01-JAN-1988 (Rel. 06, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin.
GN INS.
OS Felis silvestris catus (Cat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Carnivora; Fissipedia; Felidae; Felis.
OX NCBI_TaxID=9685;
RN [1]
RP SEQUENCE.
RX MEDLINE=86214076; PubMed=3518635;
RA Hallden G., Gafvelin G., Mutt V., Joernvall H.;
RT "Characterization of cat insulin.";
RL Arch. Biochem. Biophys. 247:20-27(1986).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
DR PIR; A01588; INCT.
DR HSSP; P01317; 1APH.
DR InterPro; IPR004825; Ins/IGF/relax.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism.
FT CHAIN 1 30 INSULIN B CHAIN.
FT NON_CONS 30 31
FT CHAIN 31 51 INSULIN A CHAIN.
FT DISULFID 7 37 INTERCHAIN.
FT DISULFID 19 50 INTERCHAIN.
FT DISULFID 36 41
SQ SEQUENCE 51 AA; 5745 MW; 9007B5096AOA7DDD CRC64;

Query Match 89.3%; Score 262.5; DB 1; Length 51;
Best Local Similarity 90.4%; Pred. No. 4.3e-26;
Matches 47; Conservative 2; Mismatches 2; Indels 1; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKTRGIVEQCCTSICSLYQLENYCN 52
       |||||||||||||||||||||||||||||  ||||||| |:|||||||:|||
Db   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKA-GIVEQCCASVCSLYQLEHYCN 51



RESULT 15 
INS_CANFA 

ID INS_CANFA STANDARD; PRT; 110 AA.
AC P01321;
DT 21-JUL-1986 (Rel. 01, Created)
DT 21-JUL-1986 (Rel. 01, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Insulin precursor.
GN INS.
OS Canis familiaris (Dog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Carnivora; Fissipedia; Canidae; Canis.
OX NCBI_TaxID=9615;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=83109071; PubMed=6296142;
RA Kwok S.C.M., Chan S.J., Steiner D.F.;
RT "Cloning and nucleotide sequence analysis of the dog insulin gene.
RT Coded amino acid sequence of canine preproinsulin predicts an
RT additional C-peptide fragment.";
RL J. Biol. Chem. 258:2357-2363(1983).
RN [2]
RP SEQUENCE OF 25-54 AND 90-110.
RX MEDLINE=66160119; PubMed=5949593;
RA Smith L.F.;
RT "Species variation in the amino acid sequence of insulin.";
RL Am. J. Med. 40:662-666(1966).
CC -!- FUNCTION: Insulin decreases blood glucose concentration. It
CC increases cell permeability to monosaccharides, amino acids and
CC fatty acids. It accelerates glycolysis, the pentose phosphate
CC cycle, and glycogen synthesis in liver.
CC -!- SUBUNIT: Heterodimer of a B chain and an A chain linked by two
CC disulfide bonds.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- SIMILARITY: Belongs to the insulin family.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; V00179; CAA23475.1; -.
DR PIR; A92413; IPDG.
DR HSSP; P01317; 1APH.
DR InterPro; IPR004825; Ins/IGF/relax.
DR Pfam; PF00049; Insulin; 1.
DR PRINTS; PR00277; INSULINB.
DR SMART; SM00078; IlGF; 1.
DR PROSITE; PS00262; INSULIN; 1.
KW Insulin family; Hormone; Glucose metabolism; Signal.
FT SIGNAL 1 24
FT CHAIN 25 54 INSULIN B CHAIN.
FT PROPEP 57 87 C PEPTIDE.
FT CHAIN 90 110 INSULIN A CHAIN.
FT DISULFID 31 96 INTERCHAIN.
FT DISULFID 43 109 INTERCHAIN.
FT DISULFID 95 100
SQ SEQUENCE 110 AA; 12190 MW; A574791864A4FB98 CRC64;

Query Match 89.1%; Score 262; DB 1; Length 110;
Best Local Similarity 59.3%; Pred. No. 1.1e-25;
Matches 51; Conservative 0; Mismatches 1; Indels 34; Gaps 1;

Qy   1 FVNQHLCGSHLVEALYLVCGERGFFYTPKT------------------------------ 30
       |||||||||||||||||||||||||||||
Db  25 FVNQHLCGSHLVEALYLVCGERGFFYTPKARREVEDLQVRDVELAGAPGEGGLQPLALEG 84

Qy  31 ----RGIVEQCCTSICSLYQLENYCN 52
           ||||||||||||||||||||||
Db  85 ALQKRGIVEQCCTSICSLYQLENYCN 110



Search completed: July 15, 2004, 16:36:26 
Job time : 4.58955 secs
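
The percentages in each result summary appear to follow from the other numbers reported with them. The sketch below recomputes them under two assumptions that the report itself does not state: that Query Match is the hit score divided by the perfect score (294 in the report header), and that Best Local Similarity is the number of identical positions divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The function names are illustrative and are not part of GenCore.

```python
# Sketch only: recompute the summary percentages from the reported counts.
# Assumed (not stated in the report):
#   Query Match           = Score / Perfect score
#   Best Local Similarity = Matches / (Matches + Conservative + Mismatches + Indels)

PERFECT_SCORE = 294  # "Perfect score" from the report header


def query_match(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Hit score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # (entry, score, matches, conservative, mismatches, indels) as reported
    hits = [
        ("INS_PANTR", 267.0, 52, 0, 0, 34),
        ("INS_BALBO", 263.5, 48, 0, 3, 1),
        ("INS_SPETR", 263.0, 51, 1, 0, 34),
    ]
    for name, score, m, c, x, g in hits:
        qm = query_match(score)
        bls = best_local_similarity(m, c, x, g)
        print(f"{name}: Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

For the three entries used here, the recomputed values agree with the percentages printed in the report, which is consistent with the assumed definitions but does not confirm them.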



